Abstract. This paper has two intertwined themes. The first is the
dynamics of certain piecewise affine maps on $\mathbb{R}^m$ that arise from a class of
analog-to-digital conversion methods called ΣΔ quantization. The second is the
analysis of the reconstruction error associated with each such method.

ΣΔ quantization generates approximate representations of functions by
sequences that lie in a restricted set of discrete values. These sequences are
special in that their local averages track the function values closely, thus
enabling simple convolutional reconstruction. In this paper, we are concerned
with the approximation of constant functions only, a basic case that presents
surprisingly complex behavior. An $m$th order ΣΔ scheme with input $x$ can be
translated into a dynamical system that produces a discrete-valued sequence (in
particular, a 0-1 sequence) $q$ as its output. When the schemes are stable, we
show that the underlying piecewise affine maps possess invariant sets that tile
$\mathbb{R}^m$ up to a finite multiplicity. When this multiplicity is one (the
single-tile case), the dynamics within the tile is isomorphic to that of a
generalized skew translation on $\mathbb{T}^m$.

The value of $x$ can be approximated using any $M$ consecutive elements of
$q$, with accuracy increasing in $M$. We show that the asymptotic behavior of the
reconstruction error depends on the regularity of the invariant sets, the order
$m$, and some arithmetic properties of $x$. We determine the behavior in a number
of cases of practical interest and provide good upper bounds in some other cases
where exact analysis is not yet available.



1. Introduction 

This paper is motivated by the mathematical problems exhibited in and suggested
by a class of practical algorithms that are used to perform analog-to-digital
conversion of signals. There are two themes in our study of these problems. The
first is the dynamics of certain piecewise affine maps on $\mathbb{R}^m$ that are
associated with these algorithms. The second is the analysis of the
reconstruction error. While the first theme is somewhat independent of the
second and is of great interest on its own, the second theme turns out to depend
crucially on the first and is of interest for theoretical as well as practical
reasons.

Let us start with the following abstract algorithm for analog-to-digital encoding:
For each input real number $x$ in some interval $I$, there is a map $\mathcal{T}_x$ on a
space $S$, and a finite partition $\Pi_x = \{\Omega_{x,1}, \dots, \Omega_{x,K}\}$ of $S$. For a fixed
set of real numbers $d_1 < \cdots < d_K$, and a typically fixed (but arbitrary) initial
point $u_0 \in S$, we define a discrete-valued output sequence $q := q_x$ via

$$ q[n] = d_i \quad \text{if } u[n-1] := \mathcal{T}_x^{\,n-1}(u_0) \in \Omega_{x,i}. \tag{1.1} $$

We would like the mapping $x \mapsto q$ to be invertible in a very special way: For an
input-independent family of averaging convolutional kernels $\phi_M \in \ell^1(\mathbb{Z})$,
$M = 1, 2, \dots$, we require that for all $x \in I$, as $M \to \infty$,

$$ (q * \phi_M)[n] \to x, \quad \text{uniformly in } n. \tag{1.2} $$

For normalization, we ask the size of the averaging window (the support of $\phi_M$) to
grow linearly in $M$,^1 and the weights to satisfy $\sum_n \phi_M[n] = 1$.

Note that such an encoding of real numbers is inherently different from binary
expansion (or any other expansion in a number system) in that, due to (1.2),
equal-length portions of the sequence $q$ are required to be equally good in
approximating the value of $x$. Hence, there is a "translation-invariance"
property in the representation.

This setting is a special case of a more general one in which $x = (x[n])_{n \in \mathbb{Z}}$ is a
bounded sequence taking values in $I$ and

$$ q[n] = d_i \quad \text{if } u[n-1] \in \Omega_{x[n],i}, \tag{1.3} $$

where we now define $u[n] := \mathcal{T}_{x[n]}(u[n-1])$, and require that

$$ (q - x) * \phi_M \to 0 \quad \text{uniformly}. \tag{1.4} $$

The basic motivation behind this type of encoding is the following intuitive idea: Let
the elements $x[n]$ be closely and regularly spaced samples of a smooth function
$X : \mathbb{R} \to I$. Since local averages of these samples around any point $k$ would
approximate $x[k]$, i.e., $x * \phi_M \approx x$ for suitable $\phi_M$, (1.4) would then imply that
the sequence $x$ (and therefore the function $X$) can be approximated by the
convolution $q * \phi_M$.

Such analog-to-digital encoding algorithms have been developed and used in
electrical engineering for a few decades now. The most notable examples are the ΣΔ
quantization (also called ΣΔ modulation) of audio signals and the closely related
error diffusion in digital halftoning of images. There are several sources in the
electrical engineering literature on the theoretical and practical aspects of ΣΔ
quantization [6| llflll2T]. Digital halftoning and its connections to ΣΔ quantization
can be found in [H l2l HI [Tol I25j. Recently, ΣΔ quantization has also received
interest in the mathematical community, especially in approximation theory and
information theory, since a very important question is the rate of convergence in
(1.4) [5, 10, 13, 15].

We give in Section 2 the original description of an $m$th order ΣΔ modulation
scheme in terms of difference equations. The underlying specific map, which we
then refer to as $\mathcal{M}_x$ (the "modulator map") and which is described in Section 4,
is the piecewise affine transformation on $S = \mathbb{R}^m$ defined by

$$ \mathcal{M}_x(\mathbf{v}) = \mathbf{L}\mathbf{v} + (x - d_i)\mathbf{1} \quad \text{if } \mathbf{v} \in \Omega_{x,i}, \tag{1.5} $$

where $\mathbf{L} := \mathbf{L}_m$ is the $m \times m$ lower triangular matrix of 1's and
$\mathbf{1} := \mathbf{1}_m := (1, \dots, 1)^\top \in \mathbb{Z}^m$. Each ΣΔ scheme is therefore characterized by its
order $m$, the partition $\Pi_x$, and the numbers $\{d_i\}$. A scheme is called $k$-bit if the
size $K$ of the partition satisfies $2^{k-1} < K \le 2^k$. If the numbers $\{d_i\}$ are in an
arithmetic progression, this is referred to as uniform quantization. As a
consequence of the normalization $\sum_n \phi_M[n] = 1$, the input numbers $x$ are chosen in
$I \subset [d_1, d_K]$. A scheme is said to be stable if for each $x$, forward trajectories
under the action of $\mathcal{M}_x$ are bounded in $\mathbb{R}^m$. (More refined definitions of stability
will be given in Section 4.) The partition $\Pi_x$ is an essential part of the algorithm
because of its central role in stability.

^1 It will be of interest to use infinitely supported kernels as well. We will define
the necessary modifications to handle this situation later.

It is natural to measure the accuracy of a scheme by how fast the worst case
error $\|(x - q) * \phi_M\|_\infty$ converges to zero. It is known that for an $m$th order stable
scheme, and an appropriate choice for the family $\mathcal{F} = \{\phi_M\}$ of filters,^2 this
quantity is $O(M^{-m})$ [S]. The hidden constant depends on the scheme as well as the
input sequence $x$. Here, the exponent $m$ is not sharp; in fact, for $m = 1$ and
$m = 2$, improvements have been given for various schemes [13, 14]. We will review
the basic approximation properties of ΣΔ quantization in Section 2.

In applications, it is also common to measure the error in the root mean square
norm due to its more robust nature (this norm is defined in Section 3). It is known
for a small class of schemes we call ideal, and a small class of sequences
(basically, constants and pure sinusoids), that this norm, when averaged over a
smooth distribution of values of $x$, has the asymptotic behavior $O(M^{-m-1/2})$
[Hl^DEl. The analyses employed in obtaining these results rely on very special
properties of the ideal schemes, such as employing an (effective) $m$-bit uniform
quantizer for the $m$th order scheme. It was not known how to extend these results
to low-bit schemes (in particular, 1-bit schemes) of high order, for which
experimental results and simulations suggested similar asymptotic behavior for the
root mean square error.

It is the topic of this paper to provide a general framework and methodology to
analyze ΣΔ quantization in an arbitrary setup (in terms of partition and number
of bits) when inputs are constant sequences. With regard to the first theme of this
paper, we prove in Section 5 that the maps $\mathcal{M}_x$ have the outstanding property of
yielding tiling invariant sets, up to a multiplicity that is determined by the map. In
the particular case of single tiles being invariant under $\mathcal{M}_x$ (which also appears
to be systematically satisfied by all practical ΣΔ quantization schemes), we develop
a spectral theory of ΣΔ quantization. This constitutes the second theme of the paper.
The particular consequence of tiling that enables our spectral analysis is presented
in Section 6. The resulting new error analysis for general and particular cases is
presented in the remainder of the paper.



^2 We shall adopt the electrical engineering terminology "filter" to refer to a
sequence (or function) that acts convolutionally.






NGUYEN T. THAO AND C. SINAN GUNTURK 



Some notation. The symbols $\mathbb{R}$, $\mathbb{Z}$, and $\mathbb{N}$ denote the set of real numbers, the set
of integers, and the set of natural numbers, respectively. $\mathbb{T}$ denotes the set of real
numbers modulo 1, i.e., $\mathbb{T} = \mathbb{R}/\mathbb{Z}$. Functions on $\mathbb{R}^m$ that are 1-periodic in each
dimension are assumed to be defined on $\mathbb{T}^m$ via the identification $\mathbb{T} = [0,1)$, and
functions defined on $\mathbb{T}^m$ are extended to $\mathbb{R}^m$ by periodization.

Vectors and matrices are denoted by boldface letters. Transpose is denoted by a
superscript $\top$. The $j$th coordinate of a vector $\mathbf{v}$ is denoted by $v_j$, unless
otherwise specified. Sequence elements are denoted using brackets, such as in
$\omega = (\omega[n])_{n \in \mathbb{Z}}$. The sequence $\tilde\omega$ denotes the time reversal of $\omega$ defined by
$\tilde\omega[n] := \omega[-n]$, and the symbol $*$ is used to denote the convolution operation.

We define two types of autocorrelation. The autocorrelation $A_f$ is defined for
square integrable functions (or sequences) $f$, by the formula

$$ A_f(t) = (f * \tilde f)(t) = \int f(\xi)\, f(\xi + t)\, d\xi. $$

The autocorrelation $\rho_\omega$ is defined for bounded sequences (or functions) $\omega$, by the
formula

$$ \rho_\omega[k] = \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \omega[n]\, \omega[n+k], $$

provided the limit exists.

The Fourier series coefficients of a measure $\mu$ on $\mathbb{T}$ are given by

$$ \hat\mu[n] := \int_{\mathbb{T}} e^{-2\pi i n \xi}\, d\mu(\xi). $$

Whenever convenient, the Fourier transform of a sequence $s = (s[n])_{n \in \mathbb{Z}}$ will be
denoted by the capital letter $S$, i.e., $S = \hat s$.

The "big oh" $f = O(g)$ and the "small oh" $f = o(g)$ notations will have their usual
meanings. When constants matter, we also use the notation $f \lesssim_\alpha g$ to denote that
there exists a constant $C$ that may depend on the parameter (or set of parameters)
$\alpha$ such that $f \le Cg$. We write $f \asymp g$ if $f \lesssim g$ and $g \lesssim f$, which is the same as
$f = \Theta(g)$.

2. Basic theory of ΣΔ quantization

In this section, we describe the principles of ΣΔ quantization (modulation) via
a set of defining difference equations. The description in terms of piecewise affine
maps on $\mathbb{R}^m$ will be given in Section 4. Although the schemes representable by these
difference equations do not constitute the whole collection of algorithms called by
the name ΣΔ modulation, they are sufficiently general to cover a large class of
algorithms that are used in practice and many more to be investigated.

Let $m$ be the order of the scheme, and $x = (x[n])_{n \in \mathbb{Z}}$ be the input sequence. Then
a sequence of state-vectors, denoted $\mathbf{u}[n] = (u_1[n], \dots, u_m[n])^\top$, $n = 0, 1, \dots$,
and a sequence of output quantized values (or symbols), denoted $q[n]$, $n = 1, 2, \dots$,
are defined recursively via the set of equations

$$ \begin{aligned} q[n] &= Q(x[n], \mathbf{u}[n-1]), \\ u_1[n] &= u_1[n-1] + x[n] - q[n], \\ u_j[n] &= u_j[n-1] + u_{j-1}[n], \quad 2 \le j \le m, \end{aligned} \tag{2.1} $$

where the mapping $Q : \mathbb{R} \times \mathbb{R}^m \to \{d_1, \dots, d_K\}$, called the quantization rule, or
simply the quantizer of the ΣΔ modulator, is specific to the scheme. In circuit
theory, these equations are represented as a feedback-loop system via the block
diagram given in Figure 1.

Figure 1. Block diagram of an $m$th order ΣΔ modulator.
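The recursion (2.1) is straightforward to simulate. The following Python sketch iterates it for a constant input. The names `sigma_delta` and `greedy_quantizer` are illustrative, and the rule $q[n] = \lfloor u_1[n-1] + \cdots + u_m[n-1] + x \rfloor$ is an assumed multi-level quantizer, chosen here only because it keeps $u_m[n]$ in $[0,1)$; it is not a rule singled out by the text.

```python
import math

def sigma_delta(x, m, quantizer, n_steps, u0=None):
    """Iterate the difference equations (2.1) for a constant input x."""
    u = list(u0) if u0 is not None else [0.0] * m
    q_out, states = [], []
    for _ in range(n_steps):
        q = quantizer(x, u)        # q[n] = Q(x[n], u[n-1])
        u[0] = u[0] + x - q        # u_1[n] = u_1[n-1] + x[n] - q[n]
        for j in range(1, m):      # u_j[n] = u_j[n-1] + u_{j-1}[n]
            u[j] = u[j] + u[j - 1]
        q_out.append(q)
        states.append(tuple(u))
    return q_out, states

def greedy_quantizer(x, u):
    """Assumed rule: the integer that brings the last state u_m[n] into [0, 1)."""
    return math.floor(sum(u) + x)

# Second-order example with constant input x = 0.3 (hypothetical parameters).
q, states = sigma_delta(0.3, 2, greedy_quantizer, 1000)
```

With this assumed rule and $m = 2$, the last state stays in $[0,1)$, so $u_1[n] = u_2[n] - u_2[n-1]$ stays in $(-1,1)$ and the output takes values in $\{-1, 0, 1, 2\}$ for $0 < x < 1$.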

In addition to producing the output sequence $q$, the role of the quantizer $Q$ of a
ΣΔ modulator is to keep the variables $u_j$ bounded. A more precise definition of this
notion of stability will be given later. Let us see how boundedness of $u_j$ results in
a simple reconstruction algorithm. It can be seen directly from (2.1) that for each
$j = 1, \dots, m$, the state variable $u_j$ satisfies

$$ x - q = \Delta^j u_j, \tag{2.2} $$

where $\Delta$ is the difference operator defined by $(\Delta v)[n] = v[n] - v[n-1]$. Take
$j = 1$, and assume that $x$ is constant. Summing (2.2) over $M$ consecutive indices,
the right-hand side telescopes, and it follows that

$$ \left| x - \frac{1}{M} \sum_{n=k+1}^{k+M} q[n] \right| = \frac{1}{M}\, \bigl| u_1[k+M] - u_1[k] \bigr| \le \frac{2\,\|u_1\|_\infty}{M}. \tag{2.3} $$

This means that simple averaging of any $M$ consecutive output values $q[k]$ yields a
reconstruction within $O(M^{-1})$.
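The $O(M^{-1})$ averaging bound can be checked numerically in the first-order case. The sketch below assumes the 1-bit rule $q[n] = \lfloor u[n-1] + x \rfloor$ for $0 \le x < 1$, which keeps the state in $[0,1)$; this rule and the helper names are illustrative choices, not taken from the text. Since the state is confined to $[0,1)$, the telescoping argument predicts a worst-case window error below $1/M$.

```python
import math

def first_order_bits(x, n_steps, u0=0.0):
    """First-order recursion u[n] = u[n-1] + x - q[n] with an assumed 1-bit rule."""
    u, bits = u0, []
    for _ in range(n_steps):
        q = math.floor(u + x)   # assumed rule; q is 0 or 1 when 0 <= x < 1
        u = u + x - q           # the state stays in [0, 1)
        bits.append(q)
    return bits

def worst_window_error(x, bits, M):
    """Largest |x - average of M consecutive bits| over all windows."""
    prefix = [0]
    for b in bits:
        prefix.append(prefix[-1] + b)
    return max(abs(x - (prefix[k + M] - prefix[k]) / M)
               for k in range(len(bits) - M + 1))

x = math.sqrt(2) - 1
bits = first_order_bits(x, 5000)
errors = {M: worst_window_error(x, bits, M) for M in (10, 100, 1000)}
```

Every window of length $M$ is tested, not only the first one, which reflects the translation invariance of the representation.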
This approximation result can be generalized easily. For simplicity of the
discussion, let us assume that the difference equation (2.2) is satisfied on the whole
of $\mathbb{Z}$ (with some care, this can be achieved via backwards iteration of (2.1)). For a
given averaging filter $\phi \in \ell^1(\mathbb{Z})$ with $\sum_n \phi[n] = 1$, let

$$ e_{x,\phi} := x - q * \phi \tag{2.4} $$

be the error sequence. Since $x$ is a constant sequence, we have $x = x * \phi$. Therefore

$$ e_{x,\phi} = (x - q) * \phi = (\Delta^m u_m) * \phi = u_m * (\Delta^m \phi), \tag{2.5} $$

where at the last step we have used commutativity of convolutional operators. From
this, we obtain

$$ \|e_{x,\phi}\|_\infty \le \|u_m\|_\infty\, \|\Delta^m \phi\|_1. \tag{2.6} $$

It is not hard to show that there is a family of averaging kernels $\phi_M$ (which can
be, for instance, discrete B-splines of degree $m$) with support size growing linearly
in $M$ such that $\|\Delta^m \phi_M\|_1 \le C_m M^{-m}$. Combined with (2.6), this yields the bound
$O(M^{-m})$ on the uniform approximation error. A proof of this result in the more
general setting of oversampling of bandlimited functions can be found in [HI I12j.
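One concrete family with the stated property can be built by convolving the length-$M$ averaging window with itself $m$ times. This is an assumed construction, offered as a sketch of a discrete B-spline type kernel rather than the specific family used in the cited proofs. Since $\Delta$ applied to the window gives $(\delta_0 - \delta_M)/M$, the $m$th difference of the kernel is the $m$-fold convolution of that two-point sequence, whose $\ell^1$ norm is $(2/M)^m$.

```python
def convolve(a, b):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def bspline_kernel(M, m):
    """m-fold convolution of the length-M averaging window; the weights sum to 1."""
    box = [1.0 / M] * M
    phi = box
    for _ in range(m - 1):
        phi = convolve(phi, box)
    return phi

def difference(v):
    """(Delta v)[n] = v[n] - v[n-1], with v extended by zero outside its support."""
    padded = [0.0] + list(v) + [0.0]
    return [padded[i + 1] - padded[i] for i in range(len(padded) - 1)]

def l1_norm_of_mth_difference(M, m):
    """Compute ||Delta^m phi_M||_1 for the kernel above."""
    d = bspline_kernel(M, m)
    for _ in range(m):
        d = difference(d)
    return sum(abs(t) for t in d)
```

The support of this kernel has $m(M-1)+1$ points, so it grows linearly in $M$ as required by the normalization in the introduction.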

3. Mean square error and its spectral representation

For the rest of this paper, we shall be interested in the mean square error (also
called the time-averaged square error) of approximation defined by

$$ \mathcal{E}(x, \phi) := \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \bigl| e_{x,\phi}[n] \bigr|^2, \tag{3.1} $$

provided the limit exists (otherwise the lim is replaced by a limsup). The root mean
square error is defined to be $\sqrt{\mathcal{E}(x, \phi)}$. For convenience in the notation, we shall
work with $\mathcal{E}(x, \phi)$.

The mean square error enjoys properties that are desirable from an analytic point
of view. The definition of autocorrelation sequence yields an alternative description
given by

$$ \mathcal{E}(x, \phi) = \rho_{e_{x,\phi}}[0]. \tag{3.2} $$

Using the formula (2.5) and the standard relation $\rho_{\omega * g} = \rho_\omega * g * \tilde g$ whenever
$\rho_\omega$ exists and $g \in \ell^1$, we find that

$$ \mathcal{E}(x, \phi) = \bigl( \rho_{u_j} * (\Delta^j \phi) * \widetilde{(\Delta^j \phi)} \bigr)[0]. \tag{3.3} $$

This formula is valid for any $j = 1, \dots, m$, provided $\rho_{u_j}$ exists. In fact, it suffices
to compute $\rho_{u_m}$ only, since $u_j = \Delta^{m-j} u_m$ yields

$$ \rho_{u_j} = \Delta^{m-j}\, \tilde\Delta^{m-j} \rho_{u_m}, $$

where $\tilde\Delta$ denotes the time-reversed difference, $(\tilde\Delta v)[n] = v[n] - v[n+1]$.

We shall abbreviate $\rho_{u_m}$ by $\rho_u$.

The computation of $\mathcal{E}(x, \phi)$ can also be carried out in the spectral domain. Since
$\rho_u$ is positive-definite, it constitutes, by Herglotz' theorem ^1 p. 38], the Fourier
coefficients of a non-negative measure $\mu$ on $\mathbb{T}$ (the power spectral measure),
i.e.,

$$ \rho_u[k] = \int_{\mathbb{T}} e^{-2\pi i k \xi}\, d\mu(\xi). \tag{3.4} $$

Elementary Fourier analysis yields the spectral formula

$$ \mathcal{E}(x, \phi) = \int_{\mathbb{T}} |2\sin(\pi\xi)|^{2m}\, |\Phi(\xi)|^2\, d\mu(\xi), \tag{3.5} $$

where $\Phi$ has the absolutely convergent Fourier series representation

$$ \Phi(\xi) = \sum_n \phi[n]\, e^{2\pi i n \xi}. $$

This computational alternative is effective when the measure $\mu$ has a simple
description. On the other hand, it may happen that this measure is somewhat complex
to compute with directly. As we shall demonstrate, there will generally be a pure
point (discrete) component $\mu_{pp}$ (i.e., a weighted sum of Dirac masses), and an
absolutely continuous component $\mu_{ac}$ yielding a spectral density $s(\cdot) \in L^1(\mathbb{T})$, where
$d\mu_{ac}(\xi) = s(\xi)\, d\xi$. The continuous singular component will be nonexistent. We shall
analyze these two components via their Fourier coefficients. Under certain
conditions, we will be able to describe both components explicitly and compute either
asymptotics or sharp bounds for $\phi = \phi_M$ as $M \to \infty$.

4. Piecewise affine maps of ΣΔ quantization

In this section, we study the difference equations of ΣΔ modulation as a dynamical
system arising from the iteration of certain piecewise affine maps on $\mathbb{R}^m$. It easily
follows from the last two equations in (2.1) that

$$ u_j[n] = \sum_{i=1}^{j} u_i[n-1] + (x[n] - q[n]), \quad 1 \le j \le m, \tag{4.1} $$

or in short,

$$ \mathbf{u}[n] = \mathbf{L}\mathbf{u}[n-1] + (x[n] - q[n])\mathbf{1}, \tag{4.2} $$

where $\mathbf{L} := \mathbf{L}_m$ is the $m \times m$ lower triangular matrix of 1's and
$\mathbf{1} := \mathbf{1}_m := (1, \dots, 1)^\top \in \mathbb{R}^m$. Using the definition of $q[n]$, we introduce a
one-parameter family of maps $\{\mathcal{M}_x\}_{x \in \mathbb{R}}$ on $\mathbb{R}^m$ defined by

$$ \mathcal{M}_x(\mathbf{v}) := \mathbf{L}\mathbf{v} + (x - Q(x, \mathbf{v}))\mathbf{1}. \tag{4.3} $$

Hence, the evolution of the state vector $\mathbf{u}[n]$ is given by

$$ \mathbf{u}[n] = \mathcal{M}_{x[n]}(\mathbf{u}[n-1]). \tag{4.4} $$
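The passage from the recursive form (2.1) to the matrix form (4.2) can be confirmed with exact rational arithmetic. The sketch below compares one step of each form on random rational states; the function names are illustrative, and the quantizer output $q$ is treated as a given integer since the identity does not depend on how it was chosen.

```python
from fractions import Fraction
import random

def step_recursive(u, x, q):
    """One step of (2.1): update u_1 first, then u_j[n] = u_j[n-1] + u_{j-1}[n]."""
    new = list(u)
    new[0] = u[0] + x - q
    for j in range(1, len(u)):
        new[j] = u[j] + new[j - 1]
    return new

def step_matrix(u, x, q):
    """One step of (4.2): u[n] = L u[n-1] + (x - q) 1, L lower triangular of ones."""
    m = len(u)
    L = [[1 if i <= j else 0 for i in range(m)] for j in range(m)]
    return [sum(L[j][i] * u[i] for i in range(m)) + (x - q) for j in range(m)]

random.seed(0)

def random_fraction():
    """A random rational number, used only to generate test states."""
    return Fraction(random.randint(-50, 50), random.randint(1, 20))

# Compare both forms on 100 random fourth-order states.
agree = all(
    step_recursive(u, x, q) == step_matrix(u, x, q)
    for u, x, q in (
        ([random_fraction() for _ in range(4)], random_fraction(), random.randint(-2, 2))
        for _ in range(100)
    )
)
```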

According to the formulation presented in the introduction, the elements of the
partition $\Pi_x$ are then given by $\Omega_{x,i} = \{\mathbf{v} \in \mathbb{R}^m : Q(x, \mathbf{v}) = d_i\}$, and the
expression (4.3) is equivalent to (1.5). For the rest of the paper, we shall assume
that $x[n] = x$ is a constant sequence so that

$$ \mathbf{u}[n] = \mathcal{M}_x^{\,n}(\mathbf{u}[0]), \tag{4.5} $$

and

$$ q[n] = Q\bigl(x, \mathcal{M}_x^{\,n-1}(\mathbf{u}[0])\bigr). \tag{4.6} $$

A variety of choices for the quantizer $Q$ have been introduced in the practice of
ΣΔ modulation. Most of these are designed with circuit implementation in mind,
and therefore necessitate simple arithmetic operations, such as linear combinations
and simple thresholding. A canonical example would be

$$ Q_0(x, \mathbf{v}) = \lfloor \alpha_0 x + \alpha_1 v_1 + \cdots + \alpha_m v_m + \beta_0 \rfloor + \beta_1, \tag{4.7} $$

where the coefficients $\alpha_i$ and $\beta_i$ are specific to each scheme. We will call these
rules "linear", referring to the fact that the sets $\Omega_{x,i}$ are separated by translated
hyperplanes in $\mathbb{R}^m$. There has also been recent research on more general quantization
rules and their benefits El.

Typically, an electrical circuit cannot handle arbitrarily large amplitudes, and
clips off quantities that are beyond certain values. This is called overloading. In
this case, the effective mapping $Q$ is given by

$$ Q(x, \mathbf{v}) = \begin{cases} Q_0(x, \mathbf{v}) & \text{if } Q_0(x, \mathbf{v}) \in \{d_1, \dots, d_K\}, \\ d_1 & \text{if } Q_0(x, \mathbf{v}) < d_1, \\ d_K & \text{if } Q_0(x, \mathbf{v}) > d_K. \end{cases} \tag{4.8} $$

For the rest of the paper, we assume that the $d_i$ form a subset of an arithmetic
progression of spacing 1, as is the case for the rule (4.7). Since we can always
subtract a fixed constant from $x$ and the $d_i$, we also assume, without loss of
generality, that the $d_i$ are simply integers. We shall be most interested in one-bit
quantization rules, i.e., rules for which $\mathrm{Ran}(Q) = \{d_1, d_2\}$. Let us mention that
one-bit ΣΔ modulators are usually overloaded by their nature.
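The linear rule (4.7) and its overloaded version (4.8) translate directly into code. The following is a minimal sketch that assumes the levels are the consecutive integers from `d_min` to `d_max`, so that clipping is a simple clamp; the function names and the coefficient values in the usage lines are hypothetical.

```python
import math

def linear_rule(x, v, alpha, beta0=0.0, beta1=0):
    """Q_0 of (4.7): floor(alpha_0*x + alpha_1*v_1 + ... + alpha_m*v_m + beta_0) + beta_1."""
    s = alpha[0] * x + sum(a * vj for a, vj in zip(alpha[1:], v)) + beta0
    return math.floor(s) + beta1

def overloaded_rule(x, v, alpha, d_min, d_max, beta0=0.0, beta1=0):
    """Q of (4.8), assuming the levels are the consecutive integers d_min..d_max."""
    q0 = linear_rule(x, v, alpha, beta0, beta1)
    return min(max(q0, d_min), d_max)

# One-bit usage with levels {0, 1} and hypothetical coefficients alpha = (1, 1, 1).
alpha = (1.0, 1.0, 1.0)
clipped_high = overloaded_rule(0.5, [3.0, 3.0], alpha, 0, 1)
clipped_low = overloaded_rule(0.5, [-2.0, 0.0], alpha, 0, 1)
```

In the one-bit case the clamp is active for most states, which matches the remark that one-bit modulators are usually overloaded.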

Let us emphasize once again that the quantization rule is crucial in the stability
of the system. For a given $x$, we call a ΣΔ scheme defined by the quantization rule
$Q(x, \cdot)$ orbit stable, or simply stable, if for every initial condition $\mathbf{u}[0]$ in an
open set, the forward trajectory under the map $\mathcal{M}_x$ is bounded in $\mathbb{R}^m$, and positively
stable, if there exists a bounded set $\Gamma_0 \subset \mathbb{R}^m$ with nonempty interior that is
positively invariant under $\mathcal{M}_x$, i.e., $\mathcal{M}_x(\Gamma_0) \subset \Gamma_0$. These two notions are closely
related. Clearly, positive stability implies stability. On the other hand, in a stable
scheme, if the forward trajectories of points in an open set are bounded with a
uniform bound, then this would also imply the existence of a positively invariant
bounded set. In practice, it is also desirable that stability holds uniformly in $x$.
However, we shall not need this kind of uniformity in this paper.

In Figure 2 we depict a positively invariant set $\Gamma_0$ under the map $\mathcal{M}_x$ which is
defined by a one-bit linear rule in $\mathbb{R}^2$. The set $\Gamma_0$ was found by a computer algorithm.
In general, constructing positively invariant sets for these maps is a non-trivial task
[23, 26]. Despite the presence of a vast collection of ΣΔ schemes that are used in
hardware, only a small set of them have been proved to be stable. Most of the
engineering practice relies on extensive numerical simulation.

In Figure 2 we also show in decreasing brightness the forward iterates of $\Gamma_0$ given
by $\Gamma_k = \mathcal{M}_x^{\,k}(\Gamma_0)$. These sets converge to a limit set $\Gamma$, or the attractor, which is
shaded in black. These invariant sets are the topic of discussion of the next section.

Figure 2. The decreasing family of nested sets $\Gamma_k = \mathcal{M}_x^{\,k}(\Gamma_0)$ indicated by
decreasing brightness. The limit set $\Gamma$ is invariant (see Theorem 5.1).

To avoid heavy and awkward notation, we shall drop the real parameter $x$ from
our notation except when we need it for a specific purpose or for emphasis. It must
be understood, however, that unless noted otherwise, all objects that are derived
from these dynamical systems generally depend on $x$.

5. Stability implies tiling invariant sets

In this section we prove a crucial property of the dynamics involved in positively
stable ΣΔ schemes. This is called the tiling property and refers to the fact that there
exist trapping invariant sets that are unions of a finite collection of disjoint
tiles in $\mathbb{R}^m$. Here a tile, or a $\mathbb{Z}^m$-tile, means any subset $S$ of $\mathbb{R}^m$ with the property
that $\{S + \mathbf{k}\}_{\mathbf{k} \in \mathbb{Z}^m}$ is a partition of $\mathbb{R}^m$. Later in the paper, this property will lead
us to an exact spectral analysis of the mean square error when the multiplicity of
tiling is one.

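We consider a slightly more general class of piecewise affine maps $\mathcal{M} := \mathcal{M}_x$ on
$\mathbb{R}^m$, which are defined by

$$ \mathcal{M}(\mathbf{v}) = \mathcal{A}_{x,i}(\mathbf{v}) := \mathbf{L}\mathbf{v} + x\mathbf{1} + \mathbf{d}_i \quad \text{if } \mathbf{v} \in \Omega_{x,i}, \tag{5.1} $$

where $\mathbf{L}$ is the lower triangular matrix of all 1's, $\{\Omega_{x,i}\}_{i=1}^{K}$ is a finite Lebesgue
measurable partition of $\mathbb{R}^m$, and $\mathbf{d}_i \in \mathbb{Z}^m$ for all $i = 1, \dots, K$. When $\mathbf{d}_i = -d_i \mathbf{1}$,
these maps are the same as those that arise from ΣΔ quantization.

Theorem 5.1. |21] Assume that there exists a bounded set $\Gamma_0 \subset \mathbb{R}^m$ that is positively
invariant under $\mathcal{M}$, i.e., $\mathcal{M}(\Gamma_0) \subset \Gamma_0$. Then, the set $\Gamma \subset \Gamma_0$ defined by

$$ \Gamma := \bigcap_{k \ge 0} \mathcal{M}^k(\Gamma_0) \tag{5.2} $$

satisfies the following properties:

(a) $\mathcal{M}(\Gamma) = \Gamma$,

(b) if $\Gamma_0$ contains a tile, then so does $\Gamma$.

Proof. This result was proved previously (see the reference cited in the theorem
statement). For completeness of the discussion, we include the proof here.

(a) Clearly, $\mathcal{M}(\Gamma) \subset \Gamma \subset \Gamma_0$ since $\Gamma_0$ is positively invariant. We need to show
that $\Gamma \subset \mathcal{M}(\Gamma)$. Let $\mathbf{v} \in \Gamma$ be an arbitrary point. Define $\Gamma_k := \mathcal{M}^k(\Gamma_0)$, $k \ge 0$.
The sets $\Gamma_k$ form a decreasing sequence, and so do the sets
$F_k := \mathcal{M}^{-1}(\mathbf{v}) \cap \Gamma_k$. Note that $\mathcal{M}^{-1}(\mathbf{v})$ is always finite since there are only finitely
many $\mathcal{A}_{x,i}$'s in the definition of $\mathcal{M}$, each of which is 1-1. ($F_k$ would be finite even
if there were infinitely many sets $\Omega_{x,i}$ because inverse images under $\mathcal{M}$ have to
differ by points in $\mathbb{Z}^m$ and only finitely many of them can be present in $\Gamma_k$.) On the
other hand $\mathbf{v} \in \Gamma_{k+1} = \mathcal{M}(\Gamma_k)$, therefore $\mathbf{v}$ has an inverse image in $\Gamma_k$, i.e., $F_k$ is
non-empty. Since the $F_k$ form a decreasing sequence of non-empty finite sets, it
follows that $\mathcal{M}^{-1}(\mathbf{v}) \cap \Gamma = \bigcap_{k \ge 0} F_k \ne \emptyset$, i.e., $\mathbf{v} \in \mathcal{M}(\Gamma)$. Hence $\Gamma \subset \mathcal{M}(\Gamma)$.

(b) Let $\Gamma_0$ contain a tile $G_0$, and define $G_k = \mathcal{M}^k(G_0)$. Each $G_k$ is a tile. To
see this, note that for any given $i$, $\mathcal{A}_{x,i}$ maps tiles to tiles, and for all $\mathbf{v} \in \mathbb{R}^m$,
$\mathcal{M}(\mathbf{v}) - \mathcal{A}_{x,i}(\mathbf{v}) \in \mathbb{Z}^m$ so that $\mathcal{M}$ maps tiles to tiles as well. For an arbitrary point
$\mathbf{w} \in \mathbb{R}^m$, define the decreasing sequence of sets $H_k = (\mathbb{Z}^m + \mathbf{w}) \cap \Gamma_k$. Because $\Gamma_0$
is bounded, each $H_k$ is finite. On the other hand, $\Gamma_k \supset G_k$ implies that each $\Gamma_k$
contains a tile, yielding $H_k \ne \emptyset$. Hence $(\mathbb{Z}^m + \mathbf{w}) \cap \Gamma = \bigcap_{k \ge 0} H_k \ne \emptyset$. Since $\mathbf{w}$ is
arbitrary, this means that $\Gamma$ contains a tile. □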

In what follows, measurable means Lebesgue measurable, and $m(S)$ denotes the
Lebesgue measure of a set $S$.

Theorem 5.2. Under the condition of Theorem 5.1, assume moreover that $x$ is
irrational and that $\Gamma_0$ is measurable of non-zero measure. Then, the set $\Gamma$ defined in
(5.2) differs from the union of a finite and non-empty collection of disjoint
$\mathbb{Z}^m$-tiles by a set of measure zero.

Proof. Clearly, $\Gamma$ is measurable since $\mathcal{M}$ is piecewise affine. Let us show that
Lebesgue measure on $\Gamma$ is invariant under $\mathcal{M}$. From now on, we identify $\mathcal{M}$ with its
restriction to $\Gamma$. From Theorem 5.1, $\mathcal{M}(\Gamma) = \Gamma$, which implies $\mathcal{M}^{-1}(\Gamma) = \Gamma$ as well.
Let us define $A$ to be the set of points in $\Gamma$ with more than one pre-image. $A$ is
measurable, simply because

$$ A = \bigcup_{i \ne j} \mathcal{M}(\Gamma \cap \Omega_{x,i}) \cap \mathcal{M}(\Gamma \cap \Omega_{x,j}). $$

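We claim that $m(A) = 0$. The definition of $\mathcal{M}$ implies that $\mathcal{M}$ preserves the
measure of sets on which it is 1-1. Since $\mathcal{M}$ is 1-1 on $\mathcal{M}^{-1}(\Gamma \setminus A)$, we have
$m(\Gamma \setminus A) = m(\mathcal{M}^{-1}(\Gamma \setminus A))$. On the other hand, since each point in $A$ has at least
2 pre-images, we have $2m(A) \le m(\mathcal{M}^{-1}(A))$. This implies

$$ 2m(A) \le m(\mathcal{M}^{-1}(A)) = m(\mathcal{M}^{-1}(\Gamma)) - m(\mathcal{M}^{-1}(\Gamma \setminus A)) = m(\Gamma) - m(\Gamma \setminus A) = m(A). $$

Therefore $m(A) = m(\mathcal{M}^{-1}(A)) = 0$. Hence, for any measurable $B \subset \Gamma$, the disjoint
union $B = (B \cap A) \cup (B \setminus A)$ yields

$$ m(\mathcal{M}^{-1}(B)) = m(\mathcal{M}^{-1}(B \cap A)) + m(\mathcal{M}^{-1}(B \setminus A)) = m(B \setminus A) = m(B), $$

i.e., $\mathcal{M}$ preserves Lebesgue measure on $\Gamma$.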

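Let $\pi : \Gamma \to \mathbb{T}^m$ be the projection defined by $\pi(\mathbf{v}) = \langle \mathbf{v} \rangle$, where $\langle \mathbf{v} \rangle$ is understood
as the vector of fractional parts of the coordinates of $\mathbf{v}$. Here we identify
$[0,1)^m$ with $\mathbb{T}^m$. Let $\nu$ be the transformation of the measure $m|_\Gamma$ on $\Gamma$ under
the projection $\pi$, which is defined on the Lebesgue measurable subsets of $\mathbb{T}^m$ by
$\nu(B) = m(\pi^{-1}(B))$. Let $\mathcal{L} = \mathcal{L}_x$ be a generalized skew translation on $\mathbb{T}^m$ defined
by

$$ \mathcal{L}\mathbf{v} := \mathbf{L}\mathbf{v} + x\mathbf{1} \pmod{1}. \tag{5.3} $$

Note that $\pi\mathcal{M} = \mathcal{L}\pi$. Hence, for any measurable $B \subset \mathbb{T}^m$, we have

$$ \nu(\mathcal{L}^{-1}(B)) = m(\pi^{-1}\mathcal{L}^{-1}(B)) = m(\mathcal{M}^{-1}\pi^{-1}(B)) = m(\pi^{-1}(B)) = \nu(B), $$

i.e., $\nu$ is invariant under $\mathcal{L}$.

At this point, we note that when $x$ is irrational, $\mathcal{L}$ is uniquely ergodic, i.e., there
is a unique normalized non-trivial measure invariant under $\mathcal{L}$ which, in this case, is
the Lebesgue measure. (See, for example, [7|, [211 p. 17] for $m = 2$, and ^1 p. 159]
for general $m$.^3) Hence, $\nu = cm$ for some $c \ge 0$; this includes the possibility of the
trivial invariant measure $\nu = 0$.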

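For each $j = 0, 1, \dots$, define

$$ T_j = \{\mathbf{v} \in \mathbb{T}^m : \mathrm{card}(\pi^{-1}(\mathbf{v})) = j\}. $$

$\{T_j\}_{j \ge 0}$ is a finite measurable partition of $\mathbb{T}^m$. The finiteness is due to the fact
that $\Gamma$ is a bounded set and measurability is simply due to the relation

$$ T_j = \Bigl\{ \mathbf{v} \in \mathbb{T}^m : \sum_{\mathbf{k} \in \mathbb{Z}^m} \chi_{\Gamma + \mathbf{k}}(\mathbf{v}) = j \Bigr\}. $$

Note that

$$ c\, m(T_j) = \nu(T_j) = m(\pi^{-1}(T_j)) = j\, m(T_j). $$

This shows that there cannot exist two such sets $T_i$ and $T_j$, $i \ne j$, both with
non-zero measure. Hence, there exists a (unique) $j$, namely, $j = c$, such that
$m(\mathbb{T}^m \setminus T_j) = 0$. This implies that $\Gamma$ is the union of $j$ copies of $\mathbb{T}^m$, possibly with
the exception of a set of zero measure.

Let us now show that $j \ge 1$. Consider $\Sigma_0 := \pi(\Gamma_0) \subset \mathbb{T}^m$. Since $\Gamma_0$ is positively
invariant, we find that $\mathcal{L}(\Sigma_0) = \pi\mathcal{M}(\Gamma_0) \subset \pi(\Gamma_0) = \Sigma_0$. Since $\mathcal{L}$ is 1-1, we have
$\mathcal{L}^{-1}(\Sigma_0) \supset \Sigma_0$. Hence, $\mathcal{L}^{-1}(\Sigma_0) \,\triangle\, \Sigma_0 = \mathcal{L}^{-1}(\Sigma_0) \setminus \Sigma_0 = \mathcal{L}^{-1}(\Sigma_0 \setminus \mathcal{L}(\Sigma_0))$. This
implies, since $\mathcal{L}$ is measure-preserving,

$$ m(\mathcal{L}^{-1}(\Sigma_0) \,\triangle\, \Sigma_0) = m(\mathcal{L}^{-1}(\Sigma_0 \setminus \mathcal{L}(\Sigma_0))) = m(\Sigma_0 \setminus \mathcal{L}(\Sigma_0)) = m(\Sigma_0) - m(\mathcal{L}(\Sigma_0)) = 0. $$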

Ergodicity of $\mathcal{L}$ implies that $m(\Sigma_0)$ is 0 or 1. The first case is not possible, since
each point in $\Sigma_0$ has at most finitely many inverse images under $\pi$ and this would
violate $m(\Gamma_0) > 0$. Therefore $m(\Sigma_0) = 1$, implying that $j \ge 1$. □

Figure 3. Represented in black is the invariant set $\Gamma$ of a 1-bit
2nd order scheme whose partition is determined by the cubic curve
shown in the figure. The copies in gray are the translated versions of
$\Gamma$ by $(1,0)$ and $(1,1)$, respectively. In this example, each connected
component of $\Gamma$ is also invariant.

^3 Here, unique ergodicity is stated for the map
$(v_1, \dots, v_m) \mapsto (v_1 + x, v_2 + v_1, \dots, v_m + v_{m-1})$, which is easily shown to be
isomorphic to $\mathcal{L}$.

When $x$ is irrational, Theorem 5.2 improves Theorem 5.1(b) in two aspects. First,
the outcome is that $\Gamma$ not only contains a tile, but in fact is composed of disjoint
tiles, up to a set of measure zero. Second, to conclude this, it suffices to check that
$\Gamma_0$ has positive measure, instead of the stronger (though equivalent) requirement
that $\Gamma_0$ contain a tile. On the other hand, Theorem 5.1(b) is still interesting due
to its algebraic nature: It can be used to test if $\Gamma$ contains an exact tile (i.e.,
$\pi(\Gamma) = \mathbb{T}^m$), and it remains valid even when $x$ is rational.

Let us also note, as an application of Theorem 5.2, that whenever a positively
invariant set $\Gamma_0$ of $\mathcal{M}_x$ (for irrational $x$) can be found with $0 < m(\Gamma_0) < 2$, then the
invariant set $\Gamma$ is a single tile.

In Figure 3 we show an illustration of an invariant set which is composed of
two tiles. In this example, the ΣΔ scheme is 1-bit 2nd order and the partition is
determined by a cubic curve.

6. The single-tile case and its consequence

Since the initial experimental discovery of the tiling property in |121ll4j, we have
observed that the invariant sets $\Gamma$ resulting from practical stable second order ΣΔ
schemes systematically appear to be single tiles. We show in Figure 4 experimental
examples of $\Gamma$ on second order schemes. In Figure 5 we show the set $\Gamma$ in three cases
where its explicit analytical derivation has been possible jTl]. (In these particular
cases, $\Gamma$ is actually proven to be an exact tile.) A fundamental question is to
characterize maps $\mathcal{M}_x$ which yield a single invariant tile. In this paper, we will
simply assume that this condition is realized. As will be seen, a whole new framework
of analysis will be generated from this particular situation.

Figure 4. Representation in black of several consecutive state
points $\mathbf{u}[n]$ of various second order ΣΔ modulators with the
irrational input $x \approx 3/4$. The copies in gray are the translated versions
of the state points by $(1,0)$ and $(1,1)$, respectively.

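A tile $\Gamma$ intrinsically generates a unique projection $\langle \cdot \rangle_\Gamma : \mathbb{R}^m \to \Gamma$ such that
$\mathbf{v} - \langle \mathbf{v} \rangle_\Gamma \in \mathbb{Z}^m$ for all $\mathbf{v} \in \mathbb{R}^m$. The restriction of this $\mathbb{Z}^m$-periodic projection to
the unit cube $[0,1)^m$ (which we identify with $\mathbb{T}^m$) is a measure preserving bijection
(note that the inverse of $\langle \cdot \rangle_\Gamma : \mathbb{T}^m \to \Gamma$ is the map $\pi$ that was defined in the proof of
Theorem 5.2). When $\Gamma$ is invariant under $\mathcal{M}$, the map $\langle \cdot \rangle_\Gamma : \mathbb{T}^m \to \Gamma$ establishes an
isomorphism between $\mathcal{M}$ on $\Gamma$ and the affine transformation $\mathcal{L} := \mathcal{L}_x$ on $\mathbb{T}^m$ defined
by (5.3). Indeed, the definition of $\mathcal{L}$ easily yields $\mathcal{L}(\mathbf{v}) - \mathcal{M}(\langle \mathbf{v} \rangle_\Gamma) \in \mathbb{Z}^m$. Hence,

$$ \langle \mathcal{L}(\mathbf{v}) \rangle_\Gamma = \mathcal{M}(\langle \mathbf{v} \rangle_\Gamma), $$

or in other words, the following diagram commutes:

$$ \begin{array}{ccc} \mathbb{T}^m & \xrightarrow{\ \mathcal{L}\ } & \mathbb{T}^m \\ {\scriptstyle \langle \cdot \rangle_\Gamma} \downarrow & & \downarrow {\scriptstyle \langle \cdot \rangle_\Gamma} \\ \Gamma & \xrightarrow{\ \mathcal{M}\ } & \Gamma \end{array} $$

Figure 5. Three families of quantization rules for which the tiling
property was proven in [14] with parametric explicit expressions for
the corresponding invariant sets. Panels: (a) 2-bit "linear" with
$(d_1, d_2, d_3, d_4) = (-1, 0, 1, 2)$ ($x = 0.5$); (b) 1-bit "linear" with
$(d_1, d_2) = (0, 1)$ ($x \approx 0.52$); (c) 1-bit "quadratic" with $(d_1, d_2) = (0, 1)$
($x = 0.74$).

ERGODIC DYNAMICS IN ΣΔ QUANTIZATION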

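The first important consequence of single invariant tiles is that it reduces the
dynamical system $\mathcal{M}$ to the much simpler $\mathcal{L}$ whose $n$-fold composition can be computed
explicitly. It follows that if $\mathbf{u}[0] \in \Gamma$, then

$$ \mathbf{u}[n] = \mathcal{M}^n(\mathbf{u}[0]) = \langle \mathcal{L}^n(\mathbf{u}[0]) \rangle_\Gamma = \bigl\langle \mathbf{L}^n \mathbf{u}[0] + x\, \mathbf{s}[n] \bigr\rangle_\Gamma, \tag{6.1} $$

where $\mathbf{s}[n] := \mathbf{s}_m[n]$ is defined by

$$ \mathbf{s}[n] := \Bigl( \sum_{k=0}^{n-1} \mathbf{L}^k \Bigr) \mathbf{1}. \tag{6.2} $$

It is an easy computation to show that the $j$th coordinate of $\mathbf{s}[n]$, which we denote
by $s_j[n]$, is equal to $\binom{n+j-1}{j}$.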
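Both the congruence behind (6.1) and the binomial formula for $s_j[n]$ can be checked with exact rational arithmetic. The sketch below uses an assumed multi-level rule $q = \lfloor v_1 + \cdots + v_m + x \rfloor$ for the modulator step; any integer-valued rule would satisfy the same congruence. It tests only that $\mathbf{u}[n] - (\mathbf{L}^n \mathbf{u}[0] + x\,\mathbf{s}[n]) \in \mathbb{Z}^m$, not the identification of a particular tile $\Gamma$. All function names are illustrative.

```python
from fractions import Fraction
from math import comb, floor

def apply_L(v):
    """Multiply by the lower triangular all-ones matrix: (Lv)_j = v_1 + ... + v_j."""
    out, running = [], 0
    for vj in v:
        running += vj
        out.append(running)
    return out

def modulator_step(u, x):
    """One step of M_x with the assumed rule q = floor(u_1 + ... + u_m + x)."""
    q = floor(sum(u) + x)
    return [w + (x - q) for w in apply_L(u)]

def check_conjugacy(x, u0, n_max):
    """Check u[n] = L^n u[0] + x s[n] modulo Z^m and s_j[n] = C(n+j-1, j)."""
    m = len(u0)
    u = list(u0)              # nonlinear orbit u[n]
    t = list(u0)              # affine orbit L^n u[0] + x s[n]
    s = [Fraction(0)] * m     # s[n], built from s[n] = L s[n-1] + 1
    for n in range(1, n_max + 1):
        u = modulator_step(u, x)
        t = [w + x for w in apply_L(t)]
        s = [w + 1 for w in apply_L(s)]
        congruent = all((a - b).denominator == 1 for a, b in zip(u, t))
        binomial = all(s[j - 1] == comb(n + j - 1, j) for j in range(1, m + 1))
        if not (congruent and binomial):
            return False
    return True

# Third-order check with hypothetical rational data.
ok = check_conjugacy(Fraction(3, 10), [Fraction(1, 7), Fraction(2, 5), Fraction(3, 4)], 50)
```

A rational $x$ is used only so that the arithmetic is exact; the congruence itself does not depend on the arithmetic nature of $x$.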

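The second important consequence is that if $x$ is an irrational number, then $\mathcal{M}_x$
on $\Gamma$ inherits the ergodicity of $\mathcal{L}_x$ via the isomorphism generated by $\langle \cdot \rangle_\Gamma$. Since
$\langle \cdot \rangle_\Gamma : \mathbb{T}^m \to \Gamma$ preserves Lebesgue measure, $\mathcal{M}_x$ is then ergodic with respect to the
restriction of Lebesgue measure to $\Gamma$. Hence the Birkhoff Ergodic Theorem yields

Proposition 6.1. Let $x$ be an irrational number and $\Gamma$ be a Lebesgue measurable
$\mathbb{Z}^m$-tile (up to a set of measure zero) that is invariant under $\mathcal{M}$. Then for any
function $F \in L^1(\Gamma)$,

$$ \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} F(\mathbf{u}[n]) = \int_\Gamma F(\mathbf{v})\, d\mathbf{v} = \int_{\mathbb{T}^m} F(\langle \mathbf{v} \rangle_\Gamma)\, d\mathbf{v} \tag{6.3} $$

for almost every initial condition $\mathbf{u}[0] \in \Gamma$.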
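The simplest instance of (6.3) is the first-order case, sketched below under the assumption that the quantizer is $q[n] = \lfloor u[n-1] + x \rfloor$. Then $\Gamma = [0,1)$ and the state evolves as the irrational rotation $u[n] = u[n-1] + x \pmod 1$, so time averages of $F(u[n])$ should approach $\int_0^1 F(v)\, dv$. The function name and the choice of $x$ are illustrative.

```python
import math

def time_average(F, x, u0, N):
    """Average F along the first-order orbit u[n] = (u[n-1] + x) mod 1."""
    u, total = u0, 0.0
    for _ in range(N):
        u = (u + x) % 1.0
        total += F(u)
    return total / N

x = (math.sqrt(5) - 1) / 2                        # an irrational input in (0, 1)
avg_square = time_average(lambda v: v * v, x, 0.0, 100000)
avg_cosine = time_average(lambda v: math.cos(2 * math.pi * v), x, 0.0, 100000)
```

The space averages to compare against are $\int_0^1 v^2\, dv = 1/3$ and $\int_0^1 \cos(2\pi v)\, dv = 0$.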

This formula will be the fundamental computational tool for the analysis of the
autocorrelation sequence $\rho_u$. For the remainder of this paper, we shall assume that
we are working with quantization rules for which the invariant sets are composed of
single tiles. This will save us from repetition in the assumptions of our results.
However, it will also be important to know certain geometric features of these
invariant tiles. We will state these explicitly when needed.

7. Analysis of the autocorrelation sequence $\rho_u$

Let $P(v) = v_m$ be the projection of a vector $v \in \mathbb{R}^m$ onto its mth coordinate. If we define the function

$$F_k(v) = P(v)\,P(\mathcal{M}^k(v)), \tag{7.1}$$

then it follows that

$$u_m[n]\,u_m[n+k] = P(u[n])\,P(\mathcal{M}^k(u[n])) = F_k(u[n]),$$

and therefore Proposition 6.1 gives an expression for the value of $\rho_u[k]$:

$$\rho_u[k] = \int_\Gamma F_k(v)\,dv = \int_{\mathbb{T}^m} F_k(\langle v\rangle_\Gamma)\,dv. \tag{7.2}$$

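A direct evaluation of $\rho_u[k]$ in either of these forms is not easy, because the k-fold iterated map $\mathcal{M}^k$ as well as the invariant set $\Gamma$ are implicitly defined and complex objects. The problem can be somewhat simplified via the conjugate map $\mathcal{L}_x$. Indeed, one has

$$F_k \circ \langle\cdot\rangle_\Gamma = (P \circ \langle\cdot\rangle_\Gamma)\,(P \circ \mathcal{M}^k \circ \langle\cdot\rangle_\Gamma) = (P \circ \langle\cdot\rangle_\Gamma)\,(P \circ \langle\cdot\rangle_\Gamma \circ \mathcal{L}^k),$$

so that if we define

$$G_\Gamma = P \circ \langle\cdot\rangle_\Gamma,$$

then via (6.1), we obtain the formula

$$\rho_u[k] = \int_{\mathbb{T}^m} G_\Gamma(v)\,G_\Gamma(\mathcal{L}^k(v))\,dv = \int_{\mathbb{T}^m} G_\Gamma(v)\,G_\Gamma(L^k v + x\,\mathbf{s}[k])\,dv, \tag{7.3}$$

which now only depends on $\Gamma$.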

As is standard in the spectral theory of dynamical systems (see, e.g., [22]), let $U := U_{\mathcal{L}}$ be the unitary operator on $L^2(\mathbb{T}^m)$ defined by $(Uf)(v) = f(\mathcal{L}(v))$. Then (7.3) reduces to

$$\rho_u[k] = (G_\Gamma, U^k G_\Gamma)_{L^2(\mathbb{T}^m)}. \tag{7.4}$$

For any $f \in L^2(\mathbb{T}^m)$, the inner products $(f, U^k f)_{L^2(\mathbb{T}^m)}$, $k \in \mathbb{Z}$, define a positive-definite sequence so that there exists a unique non-negative measure $\nu_f$ on $\mathbb{T}$ with Fourier coefficients

$$\hat\nu_f[k] = (f, U^k f)_{L^2(\mathbb{T}^m)} \tag{7.5}$$

for all $k \in \mathbb{Z}$. Note that when $f = G_\Gamma$, it follows from (7.4) that the corresponding measure $\nu_{G_\Gamma} = \mu$, where $\mu$ is the spectral measure that was mentioned in Section 3 with $\hat\mu = \rho_u$.

7.1. Decomposition of the mixed spectrum: General results. We shall separate the autocorrelation sequence $\rho_u$ into two additive components that result from two different types of spectral behavior. Using the spectral theorem for unitary operators, we decompose $L^2(\mathbb{T}^m)$ into two $U$-invariant, orthogonal subspaces as $L^2(\mathbb{T}^m) = \mathcal{H}_{pp} \oplus \mathcal{H}_c$, where

$$\mathcal{H}_{pp} = \{f \in L^2(\mathbb{T}^m) : \nu_f \text{ is purely atomic}\},$$

which is also equal to the closed linear span of the set of all eigenfunctions of $U$, and

$$\mathcal{H}_c = \mathcal{H}_{pp}^\perp = \{f \in L^2(\mathbb{T}^m) : \nu_f \text{ is non-atomic (continuous)}\}.$$

In the particular case of the transformation $\mathcal{L}$, it turns out that the spectrum on $\mathcal{H}_{pp}^\perp$ is absolutely continuous (see Appendix A). Therefore we denote $\mathcal{H}_c$ by $\mathcal{H}_{ac}$. Any $f \in L^2(\mathbb{T}^m)$ can be uniquely decomposed as $f = f_{pp} + f_{ac}$, where $f_{pp} \in \mathcal{H}_{pp}$ and $f_{ac} \in \mathcal{H}_{ac}$. For $\mathcal{L}$, it is also known, as we show in Appendix A, that

$$\mathcal{H}_{pp} = \{f \in L^2(\mathbb{T}^m) : f(v) \text{ only depends on } v_1\},$$

and the orthogonal projection of $f$ onto $\mathcal{H}_{pp}$ is given by

$$f_{pp}(v) = \int_{\mathbb{T}^{m-1}} f(v_1, v')\,dv'. \tag{7.6}$$

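For notational convenience, we write $\bar f := f_{pp}$ and $\tilde f := f_{ac}$ for any $f \in L^2(\mathbb{T}^m)$. For $f = G_\Gamma$, we now consider the decomposition

$$G_\Gamma = \bar G_\Gamma + \tilde G_\Gamma. \tag{7.7}$$

Because of orthogonality and $U$-invariance of $\mathcal{H}_{pp}$ and $\mathcal{H}_{ac}$, (7.4) implies that

$$\rho_u[k] = (\bar G_\Gamma, U^k \bar G_\Gamma)_{L^2(\mathbb{T}^m)} + (\tilde G_\Gamma, U^k \tilde G_\Gamma)_{L^2(\mathbb{T}^m)}, \tag{7.8}$$

providing the decomposition

$$\rho_u = \bar\rho_u + \tilde\rho_u.$$

Here, using formula (6.1) and the fact that functions in the subspace $\mathcal{H}_{pp}$ depend only on the first variable, we obtain

$$\bar\rho_u[k] = (\bar G_\Gamma, U^k \bar G_\Gamma)_{L^2(\mathbb{T})} = \int_{\mathbb{T}} \bar G_\Gamma(v_1)\,\bar G_\Gamma(v_1 + kx)\,dv_1 \tag{7.9}$$

and

$$\tilde\rho_u[k] = (\tilde G_\Gamma, U^k \tilde G_\Gamma)_{L^2(\mathbb{T}^m)} = \int_{\mathbb{T}^m} \tilde G_\Gamma(v)\,\tilde G_\Gamma(L^k v + x\,\mathbf{s}[k])\,dv. \tag{7.10}$$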

This decomposition provides the Fourier coefficients of the pure-point component $\mu_{pp}$ and the absolutely continuous component $\mu_{ac}$ of the spectral measure, respectively. It also yields an explicit simple formula for $\mu_{pp}$ in terms of the Fourier coefficients of the pure-point part $\bar G_\Gamma$ of $G_\Gamma$. We have

Theorem 7.1.

$$\mu_{pp} = \sum_{n\in\mathbb{Z}} \big|\hat{\bar G}_\Gamma[n]\big|^2\,\delta_{nx}, \tag{7.11}$$

where $\delta_a$ denotes the unit Dirac mass at $a \in \mathbb{T}$.

Proof. Let $\nu$ denote the measure given on the right hand side of (7.11). It suffices to verify that $\hat\nu[k] = \bar\rho_u[k]$ for all $k \in \mathbb{Z}$. We find by direct evaluation that

$$\hat\nu[k] = \sum_{n\in\mathbb{Z}} \big|\hat{\bar G}_\Gamma[n]\big|^2 e^{-2\pi i k n x}.$$

Clearly, this is the Fourier expansion of the even-symmetric function

$$A_{\bar G_\Gamma}(\xi) = \int_{\mathbb{T}} \bar G_\Gamma(v)\,\bar G_\Gamma(v + \xi)\,dv$$

evaluated at $\xi = -kx$. Therefore $\hat\nu[k] = \bar\rho_u[-k] = \bar\rho_u[k]$. □

Note: It is easy to see that this result holds for any function $f \in L^2(\mathbb{T}^m)$ in the sense that

$$(\nu_f)_{pp} = \sum_{n\in\mathbb{Z}} \big|\hat{f}_{pp}[n]\big|^2\,\delta_{nx}. \tag{7.12}$$

On the other hand, the computation of $\mu_{ac}$ is not easy. Since absolute continuity implies an integrable density $s(\cdot)$, where $d\mu_{ac}(\xi) = s(\xi)\,d\xi$, we can immediately say by the Riemann-Lebesgue lemma that the Fourier coefficients $\tilde\rho_u[k]$ of $\mu_{ac}$ tend to 0 as $|k| \to \infty$. However, the rate of decay is determined by the geometry of $\Gamma$. We shall be particularly interested in the case when the density is continuous at $\xi = 0$.
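The identity behind Theorem 7.1 can be checked numerically in a simple case. The sketch below assumes the first order setting with $\bar G_\Gamma(v) = \langle v\rangle_0$, the sawtooth with values in $[-\frac12,\frac12)$, whose Fourier coefficients satisfy $|\hat{\bar G}_\Gamma[n]|^2 = 1/(2\pi n)^2$ for $n \neq 0$; it compares the autocorrelation integral with the Fourier series on the right hand side. The function names and grid sizes are illustrative choices.

```python
import math

def wrap_centered(a):
    """Representative of a modulo 1 in [-1/2, 1/2)."""
    return (a + 0.5) % 1.0 - 0.5

def autocorr_integral(xi, grid=20000):
    """Midpoint-rule value of  int_T <v>_0 <v + xi>_0 dv."""
    h = 1.0 / grid
    total = 0.0
    for j in range(grid):
        v = -0.5 + (j + 0.5) * h
        total += wrap_centered(v) * wrap_centered(v + xi)
    return h * total

def autocorr_series(xi, terms=20000):
    """Partial sum of  sum_{n != 0} e^{2 pi i n xi} / (2 pi n)^2  (real by symmetry)."""
    return sum(2.0 * math.cos(2.0 * math.pi * n * xi) / (2.0 * math.pi * n) ** 2
               for n in range(1, terms + 1))

if __name__ == "__main__":
    x = math.sqrt(2) - 1
    for k in range(4):
        print(k, autocorr_integral(k * x), autocorr_series(k * x))
```

The two columns agree up to discretization and truncation error, and the value at lag 0 is close to 1/12.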

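7.2. Properties of $\tilde\rho_u$ for the class of $v_m$-connected invariant tiles. In this section, we derive explicit formulae for the Fourier coefficients $\tilde\rho_u[k]$ of the continuous component of the spectral measure $\mu$ when the invariant tile $\Gamma$ has certain geometric regularity. For a given tile $\Gamma$ for $\mathbb{R}^m$, let us define

$$\Lambda_\Gamma := \bigcup_{k'\in\mathbb{Z}^{m-1}} \big(\Gamma + (k',0)\big), \tag{7.13}$$

and for any $v' \in \mathbb{R}^{m-1}$,

$$\Lambda_\Gamma(v') := P\big(\Lambda_\Gamma \cap (\{v'\}\times\mathbb{R})\big) = \{v_m \in \mathbb{R} : (v', v_m) \in \Lambda_\Gamma\}. \tag{7.14}$$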

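Proposition 7.2. For each $v' \in \mathbb{R}^{m-1}$, the set $\Lambda_\Gamma(v')$ is a tile in $\mathbb{R}$ with respect to $\mathbb{Z}$-translations, and

$$G_\Gamma(v', v_m) = \langle v_m\rangle_{\Lambda_\Gamma(v')}. \tag{7.15}$$

Proof. Since $\Gamma$ is a tile, the collection of sets $\{\Lambda_\Gamma + (0,k) : k \in \mathbb{Z}\}$ forms a partition of $\mathbb{R}^m$. Therefore for any $v' \in \mathbb{R}^{m-1}$, the $v_m$-section of this collection, given by $\{\Lambda_\Gamma(v') + k : k \in \mathbb{Z}\}$, is a partition of $\mathbb{R}$. This shows that $\Lambda_\Gamma(v')$ is a tile. For the second part of the claim, let $v = (v', v_m)$. The definition of $P$ immediately yields

$$(v', G_\Gamma(v)) = (v', P(\langle v\rangle_\Gamma)) = \langle v\rangle_\Gamma + (k', 0)$$

for some $k' \in \mathbb{Z}^{m-1}$. This says that $(v', G_\Gamma(v)) \in \Lambda_\Gamma$ and therefore $G_\Gamma(v) \in \Lambda_\Gamma(v')$. The result follows since $G_\Gamma(v', v_m) - v_m \in \mathbb{Z}$. □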

Definition 7.3. We say that a tile $\Gamma \subset \mathbb{R}^m$ is $v_m$-connected if for each $v' \in \mathbb{R}^{m-1}$, the one dimensional tile $\Lambda_\Gamma(v')$ is a connected set, i.e. a unit-length interval. In this case, we denote by $\lambda_\Gamma(v')$ the midpoint of $\Lambda_\Gamma(v')$.

In Figure we display examples of the function $\lambda_\Gamma$ for various schemes. For the examples in (a), (c) and (d), the tile is $v_2$-connected. Note that $v_m$-connectedness of a tile is different from its $v_m$ cross-sections being connected.

Let us use the shorthand notation $\langle\alpha\rangle_0 := \langle\alpha\rangle_{[-\frac12,\frac12)} = \langle\alpha + \tfrac12\rangle - \tfrac12$. For a $v_m$-connected tile, we have the following simple observation:

Corollary 7.4. If the tile $\Gamma$ is $v_m$-connected, then for any $v' \in \mathbb{R}^{m-1}$,

$$G_\Gamma(v', v_m) = \langle v_m - \lambda_\Gamma(v')\rangle_0 + \lambda_\Gamma(v'), \tag{7.16}$$

and $\bar G_\Gamma = \bar\lambda_\Gamma$.

Proof. If $\Gamma$ is $v_m$-connected, then $\Lambda_\Gamma(v') = [\lambda_\Gamma(v') - \tfrac12, \lambda_\Gamma(v') + \tfrac12)$. The first result follows from Proposition 7.2 and the identity $\langle\beta\rangle_{[\alpha-\frac12,\alpha+\frac12)} = \langle\beta - \alpha\rangle_0 + \alpha$, which holds for any $\alpha$ and $\beta$. The second result is a simple consequence of the fact that the first term integrates to zero over $v_m$. □
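The shorthand $\langle\cdot\rangle_0$ and formula (7.16) translate directly into code. The following minimal sketch uses illustrative function names, and the midpoint is passed in as a plain number standing for $\lambda_\Gamma(v')$ (an assumption made for the example, since the actual midpoint function depends on the tile).

```python
import math

def frac(a):
    """Fractional part <a> in [0, 1)."""
    return a - math.floor(a)

def wrap_centered(a):
    """<a>_0 = <a + 1/2> - 1/2, the representative of a mod 1 in [-1/2, 1/2)."""
    return frac(a + 0.5) - 0.5

def G_connected(v_m, midpoint):
    """Formula (7.16): representative of v_m mod 1 in [midpoint - 1/2, midpoint + 1/2)."""
    return wrap_centered(v_m - midpoint) + midpoint

if __name__ == "__main__":
    print(G_connected(2.3, 0.9))   # lies in [0.4, 1.4) and differs from 2.3 by an integer
```

The output always lies in the unit interval centered at the midpoint and differs from the input by an integer, which is exactly the content of Proposition 7.2 in the connected case.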



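Before we state the following proposition, let us note that the matrix $L_m^k := (L_m)^k$ can be decomposed as

$$L_m^k = \begin{pmatrix} L_{m-1}^k & 0 \\ \mathbf{s}_{m-1}[k]^T & 1 \end{pmatrix} \tag{7.17}$$

since $\mathbf{s}_m[k]$ satisfies $\mathbf{s}_m[k] = L_m\,\mathbf{s}_m[k-1] + \mathbf{1}_m$ with $\mathbf{s}_m[0] = 0$.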

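Proposition 7.5. Let the invariant tile $\Gamma$ be $v_m$-connected. Define for each $k$ and $v' \in \mathbb{R}^{m-1}$,

$$g_k(v') = \mathbf{s}_{m-1}[k]\cdot v' + x\,s_m[k] - \lambda_\Gamma\big(L_{m-1}^k v' + x\,\mathbf{s}_{m-1}[k]\big) + \lambda_\Gamma(v').$$

Then

$$\tilde\rho_u[k] = \int_{\mathbb{T}^{m-1}} A_{\langle\cdot\rangle_0}\big(g_k(v')\big)\,dv' + \big(\tilde\lambda_\Gamma, U^k\tilde\lambda_\Gamma\big)_{L^2(\mathbb{T}^{m-1})}. \tag{7.18}$$

In particular, if m = 2 or if $P(\Gamma)$ is an interval of unit length, then the second term drops.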

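Proof. We employ Corollary 7.4 for the evaluation of $G_\Gamma(v)$ and $G_\Gamma(L^k v + x\,\mathbf{s}[k])$. Note first that

$$L^k v + x\,\mathbf{s}[k] = \big(L_{m-1}^k v' + x\,\mathbf{s}_{m-1}[k],\ v_m + \mathbf{s}_{m-1}[k]\cdot v' + x\,s_m[k]\big).$$

Therefore, we obtain

$$\int_{\mathbb{T}} G_\Gamma(v)\,G_\Gamma(L^k v + x\,\mathbf{s}[k])\,dv_m = \int_{\mathbb{T}} \big\langle v_m - \lambda_\Gamma(v')\big\rangle_0\,\big\langle v_m + \mathbf{s}_{m-1}[k]\cdot v' + x\,s_m[k] - \lambda_\Gamma(L_{m-1}^k v' + x\,\mathbf{s}_{m-1}[k])\big\rangle_0\,dv_m + \lambda_\Gamma(v')\,\lambda_\Gamma\big(L_{m-1}^k v' + x\,\mathbf{s}_{m-1}[k]\big),$$

where the cross terms have dropped because $\int_{\mathbb{T}} \langle v_m + \psi(v')\rangle_0\,dv_m = 0$ for any function $\psi$. The first term above is equal to $A_{\langle\cdot\rangle_0}(g_k(v'))$, whereas if the second term is integrated over $\mathbb{T}^{m-1}$ we find $(\lambda_\Gamma, U^k\lambda_\Gamma)_{L^2(\mathbb{T}^{m-1})}$. The result follows since $\bar\lambda_\Gamma = \bar G_\Gamma$.

If m = 2, then moreover $\lambda_\Gamma = \bar G_\Gamma$, so that we have $\tilde\lambda_\Gamma = 0$. If $J := P(\Gamma)$ is an interval of unit length, then it is necessarily the case that $\Lambda_\Gamma = \mathbb{R}^{m-1} \times J$. In this case, we simply have $\lambda_\Gamma = \bar\lambda_\Gamma$ so that $\tilde\lambda_\Gamma = 0$. Hence the second term drops in both cases. □

7.3. Special case when $P(\Gamma) = [-\frac12, \frac12)$. There is a class of quantization rules [HlllUlTHj for which $u_m[n] \in [-\frac12, \frac12)$ for all n (for all x), so that the invariant tile $\Gamma$ satisfies $P(\Gamma) = [-\frac12, \frac12)$. These are the "ideal" rules that were mentioned in Section n and represent essentially the simplest possible quantization situation. It turns out that the spectral measure $\mu$ is quite different in its nature for m = 1 and m ≥ 2. For m = 1, we have $G_\Gamma = \bar G_\Gamma$. Hence $\mu$ is pure-point, and Theorem 7.1 yields

$$\mu = \sum_{n \neq 0} \frac{1}{(2\pi n)^2}\,\delta_{nx}.$$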



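For m ≥ 2, we simply note that $\lambda_\Gamma = 0$, so that $G_\Gamma(v) = \langle v_m\rangle_0$. The fact that $\int_{\mathbb{T}} \langle v_m\rangle_0\,dv_m = 0$ implies $\bar G_\Gamma = 0$. Hence $\mu_{pp} = 0$, i.e., $\mu$ is absolutely continuous. In addition, Proposition 7.5 yields

$$\tilde\rho_u[k] = \int_{\mathbb{T}^{m-1}} A_{\langle\cdot\rangle_0}\big(\mathbf{s}_{m-1}[k]\cdot v' + x\,s_m[k]\big)\,dv'.$$

For k = 0, the argument of the integrand is identically zero, so we obtain $\tilde\rho_u[0] = A_{\langle\cdot\rangle_0}(0) = \frac{1}{12}$. On the other hand, for all k ≠ 0, we find that $\tilde\rho_u[k] = 0$ since the integrand is of the form $A_{\langle\cdot\rangle_0}(k v_{m-1} + c)$, which integrates to zero over the variable $v_{m-1}$. Therefore,

$$\tilde\rho_u[k] = \begin{cases} \frac{1}{12} & \text{if } k = 0, \\ 0 & \text{if } k \neq 0, \end{cases}$$

and consequently $\mu$ is flat, and equal to $\frac{1}{12}$ times Lebesgue measure on $\mathbb{T}$, and the spectral density s is the constant function $s(\xi) = \frac{1}{12}$.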
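The flat spectrum can be observed empirically for m = 2. The sketch below assumes an "ideal" second order rule, so that $u_2[n] = \langle (L^n v_0 + x\,\mathbf{s}[n])_2\rangle_0$, and it assumes $L = \begin{pmatrix}1 & 0\\ 1 & 1\end{pmatrix}$ with $\mathbf{1} = (1,1)$; the initial point, the input value and the function names are arbitrary illustrative choices.

```python
import math

def wrap_centered(a):
    """Representative of a modulo 1 in [-1/2, 1/2)."""
    return (a + 0.5) % 1.0 - 0.5

def empirical_autocorrelation(x, v0, max_lag, N):
    """Empirical rho_u[k], k = 0..max_lag, from N samples of u_2[n] (case m = 2)."""
    v1, v2 = v0
    u = []
    for _ in range(N + max_lag):
        # One step of v -> L v + x 1 on the torus, with L = [[1, 0], [1, 1]] (assumed).
        v1, v2 = (v1 + x) % 1.0, (v1 + v2 + x) % 1.0
        u.append(wrap_centered(v2))
    return [sum(u[n] * u[n + k] for n in range(N)) / N for k in range(max_lag + 1)]

if __name__ == "__main__":
    rho = empirical_autocorrelation(math.sqrt(2) - 1, (0.1, 0.2), 3, 50000)
    print(rho)   # lag 0 should be near 1/12, the other lags near 0
```

With an irrational input the empirical values are consistent with $\tilde\rho_u[0] = 1/12$ and $\tilde\rho_u[k] = 0$ for $k \neq 0$, up to the slow convergence of ergodic averages.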

These results were previously obtained, in the case m = 1 in and in the case 
m > 2 in 01111. 

8. Analysis of the mean square error 

We are interested in the asymptotic behavior of $\mathcal{E}(x,\phi)$ for a given $\Sigma\Delta$ modulation scheme of order m as the support of $\phi$ increases and its Fourier transform $\Phi$ localizes around zero frequency. There will be two standard choices for $\phi$:

(1) The ideal low-pass filter given by

(2) The sinc$^p$ family given by

$$\mathrm{Sinc}^p_M(\xi) := \left(\frac{\sin(\pi M\xi)}{M\sin(\pi\xi)}\right)^p.$$

Note that $\mathrm{Sinc}^p_M(\xi)$ has Fourier coefficients given by

$$\mathrm{sinc}^p_M[n] = r_M^{*p}[n] := (\underbrace{r_M * r_M * \cdots * r_M}_{p \text{ times}})[n],$$

where $r_M$ denotes the rectangular sequence

$$r_M[n] = \begin{cases} 1/M, & 0 \le n < M, \\ 0, & \text{otherwise.} \end{cases}$$

It is a standard fact that $\mathrm{sinc}^p_M$ is a discrete B-spline of degree p − 1.
We decompose the mean square error $\mathcal{E}(x,\phi)$ as

$$\mathcal{E}(x,\phi) = \mathcal{E}_{pp}(x,\phi) + \mathcal{E}_{ac}(x,\phi),$$

which correspond to the additive contributions of $\mu_{pp}$ and $\mu_{ac}$, respectively, in the formula (3.5). Note that both terms are non-negative, and the straightforward inequality

Wtm <P \Si<i/2iO\, V^GT, (8.1) 
implies that for any of these terms it suffices to prove lower bounds for the ideal
low-pass filter or upper bounds for any sinc$^p$ family.
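The relation between the sinc$^p$ family and repeated convolution of rectangular sequences can be checked numerically. The sketch below assumes that $r_M$ is supported on $0 \le n < M$ and compares moduli only, since such a one-sided support contributes a unimodular phase factor to the frequency response; all function names are illustrative.

```python
import cmath
import math

def rect(M):
    """Rectangular sequence r_M: value 1/M on 0 <= n < M (assumed support)."""
    return [1.0 / M] * M

def convolve(a, b):
    """Plain discrete convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def sinc_p_coefficients(M, p):
    """p-fold convolution r_M * ... * r_M."""
    h = rect(M)
    for _ in range(p - 1):
        h = convolve(h, rect(M))
    return h

def frequency_response_modulus(h, xi):
    """| sum_n h[n] e^{-2 pi i n xi} |."""
    return abs(sum(c * cmath.exp(-2j * math.pi * n * xi) for n, c in enumerate(h)))

def sinc_modulus(M, p, xi):
    """| sin(pi M xi) / (M sin(pi xi)) |^p for non-integer xi."""
    return abs(math.sin(math.pi * M * xi) / (M * math.sin(math.pi * xi))) ** p

if __name__ == "__main__":
    h = sinc_p_coefficients(8, 3)
    for xi in (0.03, 0.2, 0.37):
        print(xi, frequency_response_modulus(h, xi), sinc_modulus(8, 3, xi))
```

The filter coefficients sum to one, so each member of the family is an averaging filter, and higher p gives faster decay away from zero frequency.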

8.1. The pure-point contribution $\mathcal{E}_{pp}(x,\phi)$. Our first formula follows directly from plugging the expression for $\mu_{pp}$ given by Theorem 7.1 in (3.5):

$$\mathcal{E}_{pp}(x,\phi) = \sum_{n\in\mathbb{Z}} |2\sin(\pi n x)|^{2m}\,|\Phi(nx)|^2\,\big|\hat{\bar G}_\Gamma[n]\big|^2. \tag{8.2}$$

Before we carry out our analysis of this expression, let us recall some elementary facts about Diophantine approximation. For $\alpha \in \mathbb{R}$, let $\|\alpha\|$ denote the distance of $\alpha$ to the nearest integer, that is $\|\alpha\| := \min(\langle\alpha\rangle, \langle-\alpha\rangle)$. The number $\alpha$ is said to be (Diophantine) of type $\eta$ if $\eta$ is the infimum of all numbers $\sigma$ for which

$$\|n\alpha\| \gtrsim_{\alpha,\sigma} |n|^{-\sigma} \qquad \forall n \in \mathbb{Z}\setminus\{0\}.$$

Almost every real number (in the sense of Lebesgue measure) is of type 1, the smallest attainable type.
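The quantities in this definition are easy to explore numerically. The following sketch computes $\|\alpha\|$ and the smallest value of $n^\sigma\|n\alpha\|$ over a finite range; the golden ratio is used as an illustrative example of a number of type 1 (a standard fact, not a claim from this paper), and the function names are hypothetical.

```python
import math

def dist_to_nearest_integer(a):
    """||a||: distance from a to the nearest integer."""
    f = a - math.floor(a)
    return min(f, 1.0 - f)

def min_scaled_distance(alpha, N, sigma=1.0):
    """Minimum over 1 <= n <= N of n^sigma * ||n alpha||."""
    return min(n ** sigma * dist_to_nearest_integer(n * alpha)
               for n in range(1, N + 1))

if __name__ == "__main__":
    golden = (1.0 + math.sqrt(5.0)) / 2.0
    # For a number of type 1 with bounded partial quotients this stays bounded away from 0.
    print(min_scaled_distance(golden, 1000))
```

A finite search of this kind cannot determine the type of a number, but it gives a feel for how small $\|n\alpha\|$ becomes along the integers, which is what controls the sums appearing in the proofs below.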

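The following theorem shows that for almost every x, if the function $G_\Gamma$ has a sufficiently regular projection $\bar G_\Gamma$, then the pure-point part of the mean square error after sinc$^{m+1}$ filtering decays at least as fast as $M^{-2m-1}$.

Theorem 8.1. Let x be Diophantine of type $\eta$, and $\alpha$ and $\beta$ be two real numbers satisfying $0 < \alpha < 1$ and $\beta > (1 - \frac{\alpha}{2})\eta$. If the invariant tile $\Gamma = \Gamma_x$ of an m'th order $\Sigma\Delta$ modulator with input x satisfies

$$\big|\hat{\bar G}_\Gamma[n]\big| \lesssim |n|^{-\beta}$$

for all $n \in \mathbb{Z}\setminus\{0\}$, then

$$\mathcal{E}_{pp}(x, \mathrm{sinc}_M^{m+1}) \lesssim_{x,\alpha,\beta} M^{-2m-2+\alpha} \tag{8.3}$$

for all M.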

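Proof. Formula (8.2) reads

$$\mathcal{E}_{pp}(x, \mathrm{sinc}_M^{m+1}) = \frac{2^{2m}}{M^{2m+2}} \sum_{n\in\mathbb{Z}\setminus\{0\}} \frac{\sin^{2m+2}(\pi M n x)}{\sin^{2}(\pi n x)}\,\big|\hat{\bar G}_\Gamma[n]\big|^2. \tag{8.4}$$

Given the decay of $|\hat{\bar G}_\Gamma[n]|$ and the simple fact $|\sin(\pi\theta)| \asymp \|\theta\|$, it suffices to show that

$$\sum_{n=1}^{\infty} \frac{\|Mnx\|^{2m+2}}{\|nx\|^{2}}\,\frac{1}{n^{2\beta}} \lesssim M^{\alpha}.$$

Note that $\|Mnx\| \le \min(1, M\|nx\|)$. Hence we have

$$\sum_{n=1}^{\infty} \frac{\|Mnx\|^{2m+2}}{\|nx\|^{2}}\,\frac{1}{n^{2\beta}} \le \sum_{n=1}^{\infty} \frac{\|Mnx\|^{\alpha}}{\|nx\|^{2}}\,\frac{1}{n^{2\beta}} \le M^{\alpha} \sum_{n=1}^{\infty} \frac{1}{n^{2\beta}\,\|nx\|^{2-\alpha}}.$$

Since $2 - \alpha > 1$, it suffices to show the convergence of the sum

$$\sum_{n=1}^{\infty} \frac{1}{n^{2\beta/(2-\alpha)}\,\|nx\|}. \tag{8.5}$$

Let $\lambda > 0$ be defined by $\beta = (1 - \frac{\alpha}{2})(\eta + \lambda)$, so that $2\beta/(2-\alpha) = \eta + \lambda$. Now, summation by parts shows that

$$\sum_{n=1}^{\infty} \frac{1}{n^{\eta+\lambda}\,\|nx\|} \lesssim \sum_{n=1}^{\infty} \frac{1}{n^{\eta+\lambda+1}} \Big(\sum_{k=1}^{n} \frac{1}{\|kx\|}\Big),$$

and furthermore it is well-known [20, Ex. 3.11] that

$$\sum_{k=1}^{n} \frac{1}{\|kx\|} \lesssim_{x,\sigma} n^{\sigma}$$

for any $\sigma > \eta$. Choosing $\eta + \lambda > \sigma > \eta$, we obtain the convergence of (8.5) with a sum depending on x and $\lambda$. Since $\lambda$ depends on x, $\alpha$ and $\beta$, the result of the theorem follows. □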

Note: The Diophantine condition on x can be removed if $\bar G_\Gamma$ is a trigonometric polynomial. In this case, (8.4) reduces to a finite sum, and therefore it is always convergent.

On the other hand, our next result shows that if $\bar G_\Gamma$ does not have enough regularity, in the sense specified in the following theorem, then this is the best one can get: there is an everywhere dense set of exceptional values of x for which the exponent of the error decay rate is never better than 2m, even for the ideal low pass filter.

Theorem 8.2. Given a $\Sigma\Delta$ modulator of order m, let $\phi_M$, M = 1, 2, ..., be a sequence of averaging filters such that $|\Phi_M(\xi)| \ge c_1$ on the interval $|\xi| \le c_2/M$, where $c_1$ and $c_2$ are positive constants that do not depend on M. There exists a dense set E of irrational numbers with the following property: For any $x \in E$, if there exist positive constants $\beta_x$ and $C_x$ such that the invariant tile $\Gamma = \Gamma_x$ satisfies

$$\big|\hat{\bar G}_\Gamma[n]\big| \ge C_x |n|^{-\beta_x}$$

for all but finitely many n, then for all $\delta > 0$,

$$\limsup_{M\to\infty} \mathcal{E}_{pp}(x, \phi_M)\,M^{2m+\delta} = \infty. \tag{8.6}$$

Proof. It suffices to find, for any open interval J, a point $x \in J$ with the property (8.6) for all $\delta > 0$. Given an open interval J, let $x_0 \in J$ be a dyadic rational. Let $l = \max(b, d)$ for the minimum b and d such that $b! - 1$ is an upper bound for the length of the binary expansion of $x_0$ and $2^{-d!+1}$ is a lower bound for the distance of $x_0$ to the boundary of J. Set

$$x = x_0 + \sum_{k>l} 2^{-k!}.$$

Then clearly $x \in J$. It is also a standard fact that x is irrational, in fact x is a Liouville number.

Note that for $q > l$, we have

$$\|2^{q!}x\| = \sum_{k=q+1}^{\infty} 2^{q!-k!} = 2^{-q\cdot q!} + \sum_{k=q+2}^{\infty} 2^{q!-k!}.$$

For each such q, let $n_q = 2^{q!}$ and $M_q = \lfloor 2^{q\cdot q!-1} c_2 \rfloor$. We have

$$\frac{c_2}{4M_q} < 2^{-q\cdot q!} < \|n_q x\| < 2^{-q\cdot q!+1} \le \frac{c_2}{M_q}. \tag{8.7}$$

The right side of this chain of inequalities implies $|\Phi_{M_q}(n_q x)| \ge c_1$ by our assumption on $\{\phi_M\}$. On the other hand, the left side implies $|2\sin(\pi n_q x)| \ge 4\|n_q x\| \ge c_2/M_q$. Therefore

$$\mathcal{E}_{pp}(x, \phi_{M_q}) \ge |2\sin(\pi n_q x)|^{2m}\,|\Phi_{M_q}(n_q x)|^2\,\big|\hat{\bar G}_\Gamma[n_q]\big|^2 \ge C_{x,m}\,M_q^{-2m}\,M_q^{-2\beta_x/q}, \tag{8.8}$$

where $C_{x,m} = C_x^2 c_1^2 c_2^{2m}$. The result of the theorem follows by letting $q \to \infty$ and therefore exhibiting the subsequence $M_q$ for which (8.6) holds for any $\delta > 0$. □
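The middle inequalities of the chain (8.7) can be verified with exact rational arithmetic. The sketch below works with a finite truncation of the series defining x (the discarded tail is positive and far smaller than the quantities being compared), and the particular values of $x_0$, $l$ and $q$ are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial

def liouville_truncation(x0, l, K):
    """x0 + sum_{k=l+1}^{K} 2^{-k!} as an exact rational (finite truncation of x)."""
    return x0 + sum(Fraction(1, 2 ** factorial(k)) for k in range(l + 1, K + 1))

def dist_to_nearest_integer(a):
    """||a|| for a rational number a, computed exactly."""
    f = a - (a.numerator // a.denominator)
    return min(f, 1 - f)

def check_chain(x0, l, q):
    """Check 2^{-q q!} < ||n_q x|| < 2^{-q q! + 1} for n_q = 2^{q!}."""
    x = liouville_truncation(x0, l, q + 2)
    d = dist_to_nearest_integer(2 ** factorial(q) * x)
    lower = Fraction(1, 2 ** (q * factorial(q)))
    return lower < d < 2 * lower

if __name__ == "__main__":
    # x0 = 3/8 has a binary expansion of length 3 <= 3! - 1, so l = 3 is admissible here.
    print(check_chain(Fraction(3, 8), 3, 4), check_chain(Fraction(3, 8), 3, 5))
```

The extremely fast decay of $\|n_q x\|$ relative to $n_q$ is what makes x a Liouville number and what defeats any Diophantine bound of finite type.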

8.2. The absolutely continuous contribution $\mathcal{E}_{ac}(x,\phi)$. Let us denote by s the Radon-Nikodym derivative of the absolutely continuous spectral measure $\mu_{ac}$, i.e., $d\mu_{ac} = s(\xi)\,d\xi$. A priori, we know that $s \in L^1(\mathbb{T})$, which is somewhat weak for what we would like to achieve. Our first theorem concerns the decay rate of

$$\mathcal{E}_{ac}(x, \mathrm{sinc}_M^{m+1}) = \int_{\mathbb{T}} |2\sin(\pi\xi)|^{2m}\,|\mathrm{Sinc}_M^{m+1}(\xi)|^2\,s(\xi)\,d\xi$$

if it is known that s belongs to an $L^p$ space.

Theorem 8.3. If the measure $\mu_{ac}$ has density $s \in L^p(\mathbb{T})$ for some $1 \le p \le \infty$, then

$$\mathcal{E}_{ac}(x, \mathrm{sinc}_M^{m+1}) \lesssim_{m,p} \|s\|_{L^p(\mathbb{T})}\,M^{-(2m+1-1/p)}. \tag{8.9}$$

Proof. Let p' be the dual index of p, i.e., 1/p + 1/p' = 1. Note that

$$|2\sin(\pi\xi)|^{2m}\,|\mathrm{Sinc}_M^{m+1}(\xi)|^2 = |2\sin(\pi M\xi)|^{2m}\,|\mathrm{Sinc}_M(\xi)|^2\,M^{-2m} \tag{8.10}$$

$$\lesssim_m |\mathrm{Sinc}_M^2(\xi)|\,M^{-2m}, \tag{8.11}$$

so that Hölder's inequality yields

$$\mathcal{E}_{ac}(x, \mathrm{sinc}_M^{m+1}) \lesssim_m \|s\|_{L^p(\mathbb{T})}\,\|\mathrm{Sinc}_M^2\|_{L^{p'}(\mathbb{T})}\,M^{-2m}.$$

Furthermore, the simple bound $|\mathrm{Sinc}_M(\xi)| \le \min(1, (2M|\xi|)^{-1})$ implies

$$\|\mathrm{Sinc}_M^2\|_{L^{p'}(\mathbb{T})} \lesssim_{p'} M^{-1/p'}, \tag{8.12}$$

hence the theorem follows. □

On the other hand, it turns out that if s is continuous at 0, then one can calculate the exact asymptotics of $\mathcal{E}_{ac}(x, \mathrm{sinc}_M^{m+1})$ without additional assumptions.

Theorem 8.4. If the spectral density s is continuous at 0, then

$$\mathcal{E}_{ac}(x, \mathrm{sinc}_M^{m+1}) = \binom{2m}{m}\,s(0)\,M^{-2m-1} + o(M^{-2m-1}). \tag{8.13}$$

Proof. The proof has two parts. The first part is the easy calculation

$$\int_{\mathbb{T}} |2\sin(\pi\xi)|^{2m}\,|\mathrm{Sinc}_M^{m+1}(\xi)|^2\,d\xi = \binom{2m}{m}\,M^{-2m-1}. \tag{8.14}$$

To see this, note that (8.10) and the definition of $\mathrm{Sinc}_M(\xi)$ imply

$$|2\sin(\pi\xi)|^{2m}\,|\mathrm{Sinc}_M^{m+1}(\xi)|^2 = \big(2 - e^{2\pi i M\xi} - e^{-2\pi i M\xi}\big)^m \sum_{k=0}^{M-1}\sum_{l=0}^{M-1} e^{2\pi i (k-l)\xi}\,M^{-2m-2}.$$

The right hand side is the product of two trigonometric polynomials; the first polynomial has frequencies only at integer multiples of $2\pi M$ and the second polynomial has frequencies between $-2\pi(M-1)$ and $2\pi(M-1)$. The zero frequency term of the product is therefore given only by the product of the corresponding zero frequency terms, which is equal to

$$\binom{2m}{m}\,M \cdot M^{-2m-2} = \binom{2m}{m}\,M^{-2m-1},$$

hence the result.

The second part of the proof concerns the residual term

$$\int_{\mathbb{T}} |2\sin(\pi\xi)|^{2m}\,|\mathrm{Sinc}_M^{m+1}(\xi)|^2\,\big(s(\xi) - s(0)\big)\,d\xi,$$

which is bounded, using (8.10), by

$$2^{2m} M^{-2m} \int_{\mathbb{T}} \mathrm{Sinc}_M^2(\xi)\,|s(\xi) - s(0)|\,d\xi = 2^{2m} M^{-2m-1} \int_{\mathbb{T}} K_{M-1}(\xi)\,|s(\xi) - s(0)|\,d\xi,$$

where

$$K_{M-1}(\xi) = \frac{1}{M}\left(\frac{\sin(\pi M\xi)}{\sin(\pi\xi)}\right)^2$$

is the Fejér kernel. The limit

$$\lim_{M\to\infty} \int_{\mathbb{T}} K_{M-1}(\xi)\,|s(\xi) - s(0)|\,d\xi$$

is the Cesàro sum of the Fourier series of the function $f(t) = |s(-t) - s(0)|$ evaluated at t = 0. Since f is continuous at 0, the Cesàro sum converges to f(0) = 0, and therefore the limit is 0. This concludes the proof. □

Notes:

(1) A similar calculation shows that for the ideal filter the error has the 
asymptotics given by 

/o N2m+1 

£ac(x,$i^,) = s(0) 1 + 0(M-2— 3) (8.15) 

m+\/ Z 

again assuming that s is continuous at 0. 

(2) The value of $s(0)$ is equal to the sum of its Fourier coefficients, $s(0) = \sum_{k\in\mathbb{Z}} \tilde\rho_u[k]$, whenever this series converges.
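The constant $\binom{2m}{m}$ in the leading term for the sinc$^{m+1}$ filter can be confirmed numerically. The sketch below evaluates the integral of $|2\sin(\pi\xi)|^{2m}|\mathrm{Sinc}_M(\xi)|^{2m+2}$ over one period by a midpoint rule, assuming the modulus form $|\sin(\pi M\xi)/(M\sin(\pi\xi))|$ of $\mathrm{Sinc}_M$; the function name and the default number of nodes are illustrative.

```python
import math

def weighted_sinc_integral(m, M, K=None):
    """Midpoint-rule value of  int_T |2 sin(pi xi)|^{2m} |Sinc_M(xi)|^{2m+2} d xi.

    The integrand is a trigonometric polynomial of degree below (m + 1) * M, so the
    rule is exact (up to rounding) once the number of nodes K exceeds that degree.
    """
    if K is None:
        K = 2 * (m + 1) * M
    total = 0.0
    for j in range(K):
        xi = (j + 0.5) / K
        sinc = math.sin(math.pi * M * xi) / (M * math.sin(math.pi * xi))
        total += (2.0 * math.sin(math.pi * xi)) ** (2 * m) * sinc ** (2 * m + 2)
    return total / K

if __name__ == "__main__":
    for m, M in [(1, 5), (2, 8), (3, 4)]:
        print(m, M, weighted_sinc_integral(m, M), math.comb(2 * m, m) * M ** -(2 * m + 1))
```

For m = 2 the constant is 6, which is the factor that appears in the second order asymptotics of the next section.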



9. Estimates for second order schemes with $v_2$-connected invariant tiles

Second order $\Sigma\Delta$ modulators with $v_2$-connected invariant tiles are interesting because the value of x and the function $\lambda_\Gamma = \bar G_\Gamma$ completely describe the MSE behavior via the theorems we have stated in the previous sections. In particular, Proposition 7.5 provides us with the formula

$$\tilde\rho_u[k] = \int_{\mathbb{T}} A_{\langle\cdot\rangle_0}\Big(k v_1 + \tfrac{k(k+1)}{2}\,x - \lambda_\Gamma(v_1 + kx) + \lambda_\Gamma(v_1)\Big)\,dv_1$$

$$= \int_{\mathbb{T}} A_{\langle\cdot\rangle_0}\Big(kv - \lambda_\Gamma\big(v - \tfrac{x}{2} + k\tfrac{x}{2}\big) + \lambda_\Gamma\big(v - \tfrac{x}{2} - k\tfrac{x}{2}\big)\Big)\,dv, \tag{9.1}$$

where we have used the change of variable $v = v_1 + (k+1)x/2$ to obtain the second representation.

By the Riemann-Lebesgue lemma, we already know that $\tilde\rho_u[k]$ must converge to zero as $|k| \to \infty$ since $\tilde\rho_u[k] = \hat s[k]$, where $s \in L^1(\mathbb{T})$ is the spectral density. However, we would like to quantify the rate of decay in $|k|$, as this would then allow us to draw conclusions about s. Intuitively speaking, it is not hard to see from this formula that the smoother $\lambda_\Gamma$ is, the faster $\tilde\rho_u[k]$ must decay as $|k| \to \infty$, since $A_{\langle\cdot\rangle_0}$ is a zero mean function on $\mathbb{T}$. Our objective in this section is to study this relation rigorously.

Let $BV(\mathbb{T})$ denote the space of functions on $\mathbb{T}$ that have bounded variation, where $\|\cdot\|_{TV}$ denotes the total variation semi-norm, and let $A(\mathbb{T})$ denote the space of functions on $\mathbb{T}$ with absolutely convergent Fourier series, with the norm $\|f\|_{A(\mathbb{T})}$ given by $\|\hat f\|_{\ell^1}$. We have the following lemma, whose proof is given in the Appendix.

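Lemma 9.1. Let $f \in A(\mathbb{T})$ and $\varphi$ be two real valued functions on $\mathbb{T}$, where f has zero mean. Consider the integrals

$$c[k] = \int_{\mathbb{T}} f(kv + \varphi(v))\,dv. \tag{9.2}$$

The following bounds hold:

(1) If $\varphi \in BV(\mathbb{T})$, then for all $k \in \mathbb{Z}\setminus\{0\}$,

$$|c[k]| \le \frac{1}{|k|}\,\|f\|_{A(\mathbb{T})}\,\|\varphi\|_{TV}. \tag{9.3}$$

(2) If $\varphi$ is differentiable almost everywhere and $\varphi' \in BV(\mathbb{T})$, then for all $k \in \mathbb{Z}\setminus\{0\}$,

$$|c[k]| \le \frac{1}{k^2}\Big(\frac{1}{2\sqrt{3}}\,\|f\|_{L^2(\mathbb{T})}\,\|\varphi'\|_{TV} + \|f\|_{L^\infty(\mathbb{T})}\,\|\varphi'\|^2_{L^2(\mathbb{T})}\Big). \tag{9.4}$$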
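The first bound of the lemma can be illustrated numerically. The sketch below takes f to be the autocorrelation of the sawtooth $\langle\cdot\rangle_0$, for which the closed form $f(t) = \frac12 B_2(\langle t\rangle)$ with $B_2(s) = s^2 - s + \frac16$ is assumed (a standard Bernoulli polynomial identity), so that f has zero mean and $\|f\|_{A(\mathbb{T})} = \frac{1}{12}$. The perturbation $\varphi(v) = \frac14\sin(2\pi v)$, with total variation 1, and the function names are illustrative choices.

```python
import math

def f_autocorr(t):
    """f(t) = B_2(<t>)/2 with B_2(s) = s^2 - s + 1/6 (zero mean, Wiener norm 1/12)."""
    s = t - math.floor(t)
    return 0.5 * (s * s - s + 1.0 / 6.0)

def c_coefficient(k, phi, grid=20000):
    """Midpoint-rule approximation of c[k] = int_T f(k v + phi(v)) dv."""
    h = 1.0 / grid
    total = 0.0
    for j in range(grid):
        v = (j + 0.5) * h
        total += f_autocorr(k * v + phi(v))
    return h * total

def bound_part_1(k, phi_total_variation, f_wiener_norm=1.0 / 12.0):
    """Right-hand side of the 1/|k| bound: ||f||_A * ||phi||_TV / |k|."""
    return f_wiener_norm * phi_total_variation / abs(k)

if __name__ == "__main__":
    phi = lambda v: 0.25 * math.sin(2.0 * math.pi * v)   # total variation 1 on T
    for k in range(1, 7):
        print(k, abs(c_coefficient(k, phi)), bound_part_1(k, 1.0))
```

For a smooth perturbation such as this one the computed coefficients decay much faster than 1/|k|, in line with the idea that more regularity of $\varphi$ gives faster decay.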

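Corollary 9.2. Let x be given and $\Gamma$ be the invariant tile corresponding to a second order $\Sigma\Delta$ modulator. Then we have the following:

(1) If the midpoint function $\lambda_\Gamma$ has bounded variation on $\mathbb{T}$, then

$$|\tilde\rho_u[k]| \le \frac{1}{6|k|}\,\|\lambda_\Gamma\|_{TV}. \tag{9.5}$$

Consequently, one has

$$\mathcal{E}_{ac}(x, \mathrm{sinc}^3_M) \lesssim_\epsilon M^{-5+\epsilon} \tag{9.6}$$

for any $\epsilon > 0$. If the type $\eta$ of x is strictly less than 2, then

$$\mathcal{E}_{pp}(x, \mathrm{sinc}^3_M) \lesssim_\delta M^{-5-\delta} \tag{9.7}$$

for any $0 < \delta < (2 - \eta)/\eta$.

(2) If the midpoint function $\lambda_\Gamma$ has a derivative that has bounded variation on $\mathbb{T}$, then

$$|\tilde\rho_u[k]| \le \frac{1}{k^2}\Big(\frac{1}{12\sqrt{15}}\,\|\lambda'_\Gamma\|_{TV} + \frac{1}{3}\,\|\lambda'_\Gamma\|^2_{L^2(\mathbb{T})}\Big). \tag{9.8}$$

In particular, the spectral density s is continuous. Consequently, one has

$$\mathcal{E}_{ac}(x, \mathrm{sinc}^3_M) = 6\,s(0)\,M^{-5} + o(M^{-5}), \tag{9.9}$$

where

$$s(0) \le \frac{1}{12} + \frac{\pi^2}{3}\Big(\frac{1}{12\sqrt{15}}\,\|\lambda'_\Gamma\|_{TV} + \frac{1}{3}\,\|\lambda'_\Gamma\|^2_{L^2(\mathbb{T})}\Big). \tag{9.10}$$

If the type $\eta$ of x is strictly less than 4, then

$$\mathcal{E}_{pp}(x, \mathrm{sinc}^3_M) \lesssim_\delta M^{-5-\delta} \tag{9.11}$$

for any $0 < \delta < \min(1, (4 - \eta)/\eta)$.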
Proof. Let $f := A_{\langle\cdot\rangle_0}$. For each k, define

$$\psi_k(v) := -\lambda_\Gamma\big(v - \tfrac{x}{2} + k\tfrac{x}{2}\big) + \lambda_\Gamma\big(v - \tfrac{x}{2} - k\tfrac{x}{2}\big),$$

so that by (9.1), $\tilde\rho_u[k]$ is the integral (9.2) with $\varphi = \psi_k$. For these functions, we have the following exact formulas and bounds:

$$\|f\|_{A(\mathbb{T})} = \frac{1}{12}, \tag{9.12}$$

$$\|f\|_{L^\infty(\mathbb{T})} = \frac{1}{12}, \tag{9.13}$$

$$\|f\|_{L^2(\mathbb{T})} = \frac{1}{12\sqrt{5}}, \tag{9.14}$$

$$\|\psi_k\|_{TV} \le 2\,\|\lambda_\Gamma\|_{TV}, \tag{9.15}$$

$$\|\psi'_k\|_{TV} \le 2\,\|\lambda'_\Gamma\|_{TV}, \tag{9.16}$$

$$\|\psi'_k\|^2_{L^2(\mathbb{T})} \le 4\,\|\lambda'_\Gamma\|^2_{L^2(\mathbb{T})}. \tag{9.17}$$

(1) In this case we only know that $\lambda_\Gamma$ is of bounded variation.

The decay estimate (9.5) simply follows from the bound (9.3) coupled with (9.12) and (9.15).

Given that the Fourier coefficients $\tilde\rho_u[k] = \hat s[k]$ decay like 1/k, it follows from the Riesz-Thorin interpolation theorem that the spectral density $s \in L^p(\mathbb{T})$ for any $p < \infty$. Therefore Theorem 8.3 implies, with m = 2 and $\epsilon = 1/p$, the bound (9.6).

For the pure-point estimate, we use Theorem 8.1 with $\beta = 1$ and m = 2. If we define $\delta = 1 - \alpha$, where $\alpha$ is as defined in Theorem 8.1, then the result follows as stated.

(2) In this case we know that $\lambda_\Gamma$ has a derivative that is of bounded variation.

The decay estimate (9.8) follows from the bound (9.4) coupled with (9.13), (9.14), (9.16), and (9.17).

Since $\tilde\rho_u$ is summable, it follows that s is continuous. We therefore apply Theorem 8.4 to compute the exact asymptotics of $\mathcal{E}_{ac}(x, \mathrm{sinc}^3_M)$. In this case, the nonnegative number s(0) will be bounded by $\sum_k |\tilde\rho_u[k]|$. We simply add up the bounds given by (9.8), including the trivial case $|\tilde\rho_u[0]| \le \|f\|_{L^\infty(\mathbb{T})}$. This computation yields the bound (9.10).

For the pure-point estimate, we again use Theorem 8.1 but now with $\beta = 2$. We define $\delta = 1 - \alpha$, where $\alpha$ is as defined in Theorem 8.1, and note that the condition $\alpha < 1$ must be imposed, which was automatically satisfied in part (1). Then the result follows as stated.



□

10. Further remarks

In this paper, we have covered only a portion of the mathematical problems that concern $\Sigma\Delta$ quantization. We believe that the following currently unresolved problems are interesting both from the dynamical systems standpoint and the engineering perspective:

1. Which maps $\mathcal{M}$ are stable? Satisfactory answers to this question would include non-trivial sufficient conditions in terms of the quantization rule Q, or in terms of the partition $\Pi$ and the quantization levels $\{d_j\}$.

2. Which stable maps $\mathcal{M}$ yield single invariant tiles? One can include in this the case when $\Gamma$ is composed of tiles each of which is invariant under $\mathcal{M}$. In principle, each of these invariant tiles would represent a different "mode of operation".

3. What is an appropriate generalization of our spectral analysis of mean square error when $\Gamma$ is composed of more than one tile?

4. Given the quantization rule, what can be said about the geometric regularity of $\Gamma$? We used two types of geometric information about $\Gamma$ in deriving our analytical results on the mean square error asymptotics. The first type concerned "shape" (such as $v_m$-connectedness), and the second concerned "regularity" (such as the decay of Fourier coefficients of $G_\Gamma$). At this stage, the relation between the quantization rule and these two issues is highly unclear, although we have partial understanding in some cases. Even for "linear" rules, there seems to be a wide range of possibilities.

5. What are the universal principles behind tiling? Tiling invariant sets are found even when x is rational. In addition, trajectories seem to remain within exact tiles, and not just tiles "up to sets of measure zero".

Appendix A. On the spectral theory of the map $\mathcal{L}$

In this section, we will review some basic facts about the spectral theory of the map $\mathcal{L} = \mathcal{L}_x$ on $\mathbb{T}^m$, where $\mathcal{L}_x v = Lv + x\mathbf{1}$, and x is an irrational number. Most of what follows below can be derived or generalized from Anzai's work on ergodic skew product transformations [3].

The eigenfunctions of $U_{\mathcal{L}}$. We start by showing that the set of all eigenfunctions of $U = U_{\mathcal{L}}$ is precisely given by the collection of complex exponentials $f_n$, where

$$f_n(v) = e^{2\pi i n v_1}, \qquad n \in \mathbb{Z}.$$

To see this, let $f \in L^2(\mathbb{T}^m)$ be an eigenfunction of $U$ with eigenvalue $\lambda$. Since $U$ is unitary, $|\lambda| = 1$. Consider the Fourier series expansion of f given by

$$f(v) = \sum_{n\in\mathbb{Z}^m} c[n]\,e^{2\pi i n\cdot v}.$$

Since $f = \frac{1}{\lambda}(Uf)$, we have the relation

$$\sum_{n\in\mathbb{Z}^m} c[n]\,e^{2\pi i n\cdot v} = \frac{1}{\lambda}\sum_{n\in\mathbb{Z}^m} c[n]\,e^{2\pi i n\cdot(Lv + x\mathbf{1})} = \frac{1}{\lambda}\sum_{n\in\mathbb{Z}^m} c[Kn]\,e^{2\pi i x (Kn)\cdot\mathbf{1}}\,e^{2\pi i n\cdot v},$$

where $K = (L^T)^{-1}$. Comparing the coefficients, we obtain the equality

$$|c[n]| = |c[Kn]|, \qquad \forall n \in \mathbb{Z}^m.$$

Since $f \in L^2(\mathbb{T}^m)$, we can conclude that $c[n] = 0$ for any n that is not preserved under $K^j$ for some positive integer j, for otherwise we would have the infinite sequence of coefficients $c[n], c[Kn], c[K^2n], \ldots$ of equal and strictly positive magnitude. In fact, this conclusion is valid even for $f \in L^1(\mathbb{T}^m)$ since in this case it is also true that $|K^j n| \to \infty$ as $j \to \infty$, resulting in a violation of the Riemann-Lebesgue lemma.

On the other hand, it is a simple exercise to show that the only vectors that satisfy $n = K^j n$ for some power $j \ge 1$ are those of the form $n = (n_1, 0, \ldots, 0)$. Hence, any eigenfunction of $U$ depends only on the first variable $v_1$. On the first coordinate $v_1$ of v, $\mathcal{L}$ reduces to the irrational rotation by x, and hence, as is well-known, these eigenfunctions are nothing but the given complex exponentials $\{f_n\}_{n\in\mathbb{Z}}$. According to the spectral theorem, these eigenfunctions span the subspace $\mathcal{H}_{pp}$ of $L^2(\mathbb{T}^m)$.

The absolutely continuous spectrum. We shall next show that the continuous part of the spectrum is in fact absolutely continuous. This is a consequence of the fact that there exists an orthonormal basis $\{\psi_{j,k} : j \in \mathbb{Z}, k \in \mathbb{N}\}$ of $\mathcal{H}_{pp}^\perp$ with the property¹ that $U\psi_{j,k} = \psi_{j+1,k}$ for all j and k. First we will construct such a basis, and then we shall prove the statement on the absolute continuity.

From the discussion above, we know that the complex exponentials

$$f_n(v) = e^{2\pi i n\cdot v}, \qquad n \in \mathbb{Z}^m \setminus (\mathbb{Z}\times\{0\}^{m-1}),$$

form an orthonormal complete set in $\mathcal{H}_{pp}^\perp$. Note also that

$$U f_n = e^{2\pi i x\,n\cdot\mathbf{1}}\,f_{L^T n}.$$

Therefore we consider the orbit of each $n \in \mathbb{Z}^m$ under $L^T$, given by

$$O(n) = \big\{(L^T)^j n\big\}_{j\in\mathbb{Z}}.$$

It is easy to see that each $n \in \mathbb{Z}\times\{0\}^{m-1}$ is a fixed point of $L^T$ and every other n is such that the orbit is an infinite sequence of distinct points in $\mathbb{Z}^m \setminus (\mathbb{Z}\times\{0\}^{m-1})$. Divide $\mathbb{Z}^m \setminus (\mathbb{Z}\times\{0\}^{m-1})$ into equivalence classes of orbits $O(n_k)$, $k \in \mathbb{N}$ (note that distinct orbits do not intersect because $L^T$ is invertible), and define

$$\psi_{0,k} = f_{n_k}, \qquad \psi_{j,k} = U^j\psi_{0,k}, \qquad j \in \mathbb{Z},\ k \in \mathbb{N}.$$

Each $\psi_{j,k}$ is equal to some complex exponential multiplied by a complex number of unit magnitude. The $\psi_{j,k}$ are distinct, and all frequencies $n \in \mathbb{Z}^m \setminus (\mathbb{Z}\times\{0\}^{m-1})$ appear, hence $\{\psi_{j,k}\}_{j\in\mathbb{Z}, k\in\mathbb{N}}$ form an orthonormal basis of $\mathcal{H}_{pp}^\perp$ with the property that $U\psi_{j,k} = \psi_{j+1,k}$.

Let us show that every spectral measure is absolutely continuous on $\mathcal{H}_{pp}^\perp$. Let g and h be arbitrary functions with representations $g = \sum a_{j,k}\psi_{j,k}$ and $h = \sum b_{j,k}\psi_{j,k}$. Let the functions $A_k$ and $B_k$ be defined on $\mathbb{T}$ for each k with Fourier coefficients $(a_{j,k})_{j\in\mathbb{Z}}$ and $(b_{j,k})_{j\in\mathbb{Z}}$, respectively. From orthogonality, we have

$$\|g\|^2_{L^2} = \sum_k \sum_j |a_{j,k}|^2 = \sum_k \int_{\mathbb{T}} |A_k(\xi)|^2\,d\xi < \infty,$$

and similarly for h and $B_k$.

¹I.e., $\mathcal{L}$ has countable Lebesgue spectrum on $\mathcal{H}_{pp}^\perp$.

Now, we have $U^n h = \sum b_{j,k}\psi_{j+n,k}$, so that

$$(g, U^n h) = \sum_k \sum_j a_{j+n,k}\,\overline{b_{j,k}} = \int_{\mathbb{T}} e^{-2\pi i n\xi}\Big[\sum_k A_k(\xi)\,\overline{B_k(\xi)}\Big]\,d\xi.$$

Here, the $L^1$ norm of the function $\sum_k A_k(\xi)\overline{B_k(\xi)}$ is bounded by $\|g\|_{L^2}\|h\|_{L^2}$ because of the Cauchy-Schwarz inequality, hence finite. Therefore, we have that the measure $\nu_{g,h}$ defined by the inner products $(g, U^n h)$ is absolutely continuous.



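Appendix B. Proof of Lemma 9.1

Let us start by writing $f(t) = \sum_{n\neq 0} \hat f[n]\,e^{2\pi i n t}$ so that we have

$$c[k] = \sum_{n\neq 0} \hat f[n] \int_{\mathbb{T}} e^{2\pi i n(kv + \varphi(v))}\,dv, \tag{B.1}$$

where we have changed the order of summation and integration. Applying integration by parts we obtain

$$\int_{\mathbb{T}} e^{2\pi i n(kv + \varphi(v))}\,dv = -\frac{1}{2\pi i n k}\int_{\mathbb{T}} e^{2\pi i nkv}\,d\big(e^{2\pi i n\varphi(v)}\big) = -\frac{1}{k}\int_{\mathbb{T}} e^{2\pi i nkv}\,e^{2\pi i n\varphi(v)}\,d\varphi(v). \tag{B.2}$$

Part (1). For the integral in (B.2), we use the bound

$$\left|\frac{1}{k}\int_{\mathbb{T}} e^{2\pi i nkv}\,e^{2\pi i n\varphi(v)}\,d\varphi(v)\right| \le \frac{1}{|k|}\,\|\varphi\|_{TV},$$

and we simply get

$$|c[k]| \le \frac{1}{|k|}\,\|\varphi\|_{TV}\sum_{n\neq 0}|\hat f[n]| = \frac{1}{|k|}\,\|\varphi\|_{TV}\,\|f\|_{A(\mathbb{T})}.$$

Part (2). Let $\varphi$ be differentiable and $\varphi' \in BV(\mathbb{T})$. Substitute $d\varphi(v) = \varphi'(v)\,dv$ and apply another integration by parts to (B.2) to obtain

$$-\frac{1}{k}\int_{\mathbb{T}} e^{2\pi i nkv}\,e^{2\pi i n\varphi(v)}\,\varphi'(v)\,dv = \frac{1}{k(2\pi i nk)}\int_{\mathbb{T}} e^{2\pi i nkv}\,d\big(e^{2\pi i n\varphi(v)}\varphi'(v)\big).$$

Now,

$$d\big(e^{2\pi i n\varphi(v)}\varphi'(v)\big) = e^{2\pi i n\varphi(v)}\,d\varphi'(v) + (\varphi'(v))^2\,(2\pi i n)\,e^{2\pi i n\varphi(v)}\,dv,$$

so that substituting the above two formulas together with (B.2) in (B.1), we get

$$c[k] = \frac{1}{k^2}\sum_{n\neq 0}\hat f[n]\left[\frac{1}{2\pi i n}\int_{\mathbb{T}} e^{2\pi i n(kv+\varphi(v))}\,d\varphi'(v) + \int_{\mathbb{T}} (\varphi'(v))^2\,e^{2\pi i n(kv+\varphi(v))}\,dv\right]. \tag{B.3}$$

For the first part of this sum we use

$$\left|\int_{\mathbb{T}} e^{2\pi i n(kv+\varphi(v))}\,d\varphi'(v)\right| \le \int_{\mathbb{T}} |d\varphi'(v)| = \|\varphi'\|_{TV},$$

and

$$\sum_{n\neq 0}\frac{|\hat f[n]|}{2\pi |n|} \le \Big(\sum_{n\neq 0}\frac{1}{(2\pi n)^2}\Big)^{1/2}\Big(\sum_{n\neq 0}|\hat f[n]|^2\Big)^{1/2} = \frac{1}{2\sqrt{3}}\,\|f\|_{L^2(\mathbb{T})},$$

so that

$$\left|\sum_{n\neq 0}\frac{\hat f[n]}{2\pi i n}\int_{\mathbb{T}} e^{2\pi i n(kv+\varphi(v))}\,d\varphi'(v)\right| \le \frac{1}{2\sqrt{3}}\,\|f\|_{L^2(\mathbb{T})}\,\|\varphi'\|_{TV}.$$

On the other hand, the second term reduces to

$$\frac{1}{k^2}\int_{\mathbb{T}} (\varphi'(v))^2\,f(kv + \varphi(v))\,dv.$$

We bound this integral by $\|f\|_{L^\infty(\mathbb{T})}\,\|\varphi'\|^2_{L^2(\mathbb{T})}$. Combining these, the expression of (B.3) can now be bounded from above in absolute value as

$$|c[k]| \le \frac{1}{k^2}\Big(\frac{1}{2\sqrt{3}}\,\|f\|_{L^2(\mathbb{T})}\,\|\varphi'\|_{TV} + \|f\|_{L^\infty(\mathbb{T})}\,\|\varphi'\|^2_{L^2(\mathbb{T})}\Big),$$

concluding the proof. □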



Acknowledgements

The authors would like to thank Ingrid Daubechies, Ron DeVore, Özgür Yılmaz
and Yang Wang for conversations on the topic of ΣΔ quantization, tiling, and related
issues.

References 

[1] R. L. Adler, B. P. Kitchens, M. Martens, C. P. Tresser, C. W. Wu, "The mathematics of halftoning," IBM J. Res. & Dev., vol. 47, no. 1, Jan. 2003.

[2] D. Anastassiou, "Error diffusion coding for A/D conversion," IEEE Trans. on Circuits and Systems, vol. 36, no. 3, pp. 1175-1186, Sept. 1989.

[3] H. Anzai, "Ergodic Skew Product Transformation on the Torus," Osaka Math. J., 3 (1951), pp. 83-99.

[4] T. Bernard, "From Σ-Δ modulation to digital halftoning of images," Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Proc., May 1991, pp. 2805-2808, Toronto.

[5] R. Calderbank and I. Daubechies, "The Pros and Cons of Democracy," IEEE Trans. Inform. Theory, vol. 48, pp. 1721-1725, June 2002.






[6] J. C. Candy and G. C. Temes, Eds., Oversampling Delta-Sigma Data Converters: Theory, Design and Simulation, IEEE Press, 1992.

[7] H. Furstenberg, "Strict ergodicity and transformations of the torus," Amer. J. Math., vol. 83, pp. 573-601, 1961.

[8] W. Chou, P. W. Wong, and R. M. Gray, "Multistage ΣΔ modulation," IEEE Trans. Inform. Theory, vol. 35, pp. 784-796, July 1989.

[9] I. Daubechies, R. DeVore, "Reconstructing a Bandlimited Function From Very Coarsely Quantized Data: A Family of Stable Sigma-Delta Modulators of Arbitrary Order," Ann. of Math., vol. 158, no. 2, pp. 643-674, Sept. 2003.

[10] R. M. Gray, "Oversampled sigma-delta modulation," IEEE Trans. on Comm., vol. COM-35, pp. 481-489, May 1987.

[11] R. M. Gray, "Spectral Analysis of Quantization Noise in a Single-Loop Sigma-Delta Modulator with dc Input," IEEE Trans. on Comm., vol. COM-37, pp. 588-599, June 1989.

[12] C. S. Güntürk, Harmonic Analysis of Two Problems in Signal Quantization and Compression, Ph.D. thesis, Princeton University, 2000.

[13] C. S. Güntürk, "Approximating a Bandlimited Function Using Very Coarsely Quantized Data: Improved Error Estimates in Sigma-Delta Modulation," J. Amer. Math. Soc., posted on August 1, 2003, PII S 0894-0347(03)00436-3 (to appear in print).

[14] C. S. Güntürk and N. T. Thao, "Refined Analysis of MSE in Second Order Sigma-Delta Modulation with DC Inputs," submitted to IEEE Transactions on Information Theory, in revision.

[15] C. S. Güntürk, "One-Bit Sigma-Delta Quantization with Exponential Accuracy," Comm. Pure Appl. Math., vol. 56, no. 11, pp. 1608-1630, 2003.

[16] N. He, F. Kuhlmann, and A. Buzo, "Multi-loop ΣΔ quantization," IEEE Trans. Inform. Theory, vol. 38, pp. 1015-1028, May 1992.

[17] A. Katok and B. Hasselblatt, Introduction to the Modern Theory of Dynamical Systems, Cambridge University Press, 1995.

[18] Y. Katznelson, An Introduction to Harmonic Analysis, John Wiley & Sons, 1968 (reprint: Dover Publs. Inc.).

[19] T. D. Kite, B. L. Evans, A. C. Bovik, and T. L. Sculley, "Digital image halftoning as 2-D Delta-Sigma modulation," Proc. IEEE Int. Conf. on Image Proc., vol. I, pp. 799-802, Oct. 26-29, 1997, Santa Barbara, CA.

[20] L. Kuipers and H. Niederreiter, Uniform Distribution of Sequences, Wiley, 1974.

[21] S. R. Norsworthy, R. Schreier, and G. C. Temes, Eds., Delta-Sigma Data Converters: Theory, Design and Simulation, IEEE Press, 1996.

[22] W. Parry, Topics in Ergodic Theory, Cambridge University Press, 1981.

[23] R. Schreier, M. V. Goodson, and B. Zhang, "An algorithm for computing convex positively invariant sets for delta-sigma modulators," IEEE Trans. on Circuits and Systems, I, vol. 44, pp. 38-44, Jan. 1997.

[24] N. T. Thao, "Breaking the feedback loop of a class of ΣΔ A/D converters," IEEE Trans. Signal Processing, submitted.

[25] R. Ulichney, Digital Halftoning, MIT Press, Cambridge, 1987.

[26] Ö. Yılmaz, "Stability analysis for several second-order sigma-delta methods of coarse quantization of bandlimited functions," Constr. Approx. 18 (2002), no. 4, 599-623.

Department of Electrical Engineering, City College and Graduate School, City 
University of New York, Convent Avenue at 138th Street, New York, NY 10031 
E-mail address: thao@ee-mail.engr.ccny.cuny.edu 

Courant Institute of Mathematical Sciences, New York University, 251 Mercer 
Street, New York, NY 10012. 

E-mail address: gunturk@cims.nyu.edu 



